Abstract

Suppose that $\mu$ is an absolutely continuous probability measure on $\mathbb{R}^n$, for large $n$.
Then $\mu$ has low-dimensional marginals that are approximately spherically-symmetric.
More precisely, if $n \geq (C/\varepsilon)^{Cd}$, then there exist $d$-dimensional marginals of $\mu$ that
are $\varepsilon$-far from being spherically-symmetric, in an appropriate sense. Here $C > 0$ is a
universal constant.

1 Introduction 

The purpose of this paper is to clarify a seven line paragraph by Gromov [11, Section
1.2.F]. We are interested in projections of high-dimensional probability measures. Not all
probability measures on $\mathbb{R}^n$, for large $n$, are truly $n$-dimensional. For instance, a measure
supported on an atom or two should not be considered high-dimensional. Roughly speaking,
we think of a probability measure on a linear space as decently high-dimensional if
any subspace of bounded dimension contains only a small fraction of the total mass.

Definition 1.1 Let $\mu$ be a Borel probability measure on $\mathbb{R}^n$ and $\varepsilon > 0$. We say that $\mu$ is
"decently high-dimensional with parameter $\varepsilon$", or "$\varepsilon$-decent" in short, if for any linear
subspace $E \subseteq \mathbb{R}^n$,

$$ \mu(E) \leq \varepsilon \dim(E). \tag{1} $$

We say that $\mu$ is decent if it is $\varepsilon$-decent for $\varepsilon = 1/n$, the minimal possible value of $\varepsilon$.
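For a discrete measure, condition (1) can be tested by brute force. The following is a minimal sketch, not part of the original argument: the function names `rank` and `is_eps_decent` are hypothetical, and the sketch relies on the observation that for a measure with finitely many atoms it suffices to check that every subset of atoms has mass at most $\varepsilon$ times the rank of that subset (any subspace $E$ carries the mass of the atoms it contains, and those atoms span a subspace of dimension at most $\dim E$).

```python
from fractions import Fraction
from itertools import combinations


def rank(vectors):
    """Rank of a list of rational vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_eps_decent(atoms, weights, eps):
    """Test mu(E) <= eps * dim(E) for mu = sum_i weights[i] * delta_{atoms[i]}.

    Assumption of this sketch: it is enough to test subsets of atoms,
    comparing the mass of the subset with eps times its rank.
    The cost is exponential in the number of atoms.
    """
    n_atoms = len(atoms)
    for size in range(1, n_atoms + 1):
        for idx in combinations(range(n_atoms), size):
            mass = sum(weights[i] for i in idx)
            if mass > eps * rank([atoms[i] for i in idx]):
                return False
    return True


# The uniform measure on a basis of R^4 is decent (eps = 1/4).
basis = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
uniform = [Fraction(1, 4)] * 4
print(is_eps_decent(basis, uniform, Fraction(1, 4)))
```

Exact rational arithmetic is used so that the boundary case $\mu(E) = \varepsilon \dim(E)$, which occurs for the uniform measure on a basis, is not disturbed by rounding.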

Clearly, all absolutely continuous probability measures on $\mathbb{R}^n$ are decent, as are many
discrete measures. Note that a decent measure $\mu$ necessarily satisfies $\mu(\{0\}) = 0$; however,
this feature should not be taken too seriously. A measure $\mu$ is "weakly $\varepsilon$-decent" if
(1) holds for all subspaces $E \subseteq \mathbb{R}^n$ except $E = \{0\}$. For a measure $\mu$ on a measurable
space $\Omega$ and a measurable map $T : \Omega \to \Omega'$, we write $T_*(\mu)$ for the push-forward of $\mu$
under $T$, i.e.,

$$ T_*(\mu)(A) = \mu(T^{-1}(A)) $$

for all measurable sets $A \subseteq \Omega'$. When $\mu$ is a probability measure on $\mathbb{R}^n$ and $T : \mathbb{R}^n \to \mathbb{R}^\ell$
is a linear map with $\ell < n$, we say that $T_*(\mu)$ is a marginal of $\mu$, or a measure projection
of $\mu$. The classical Dvoretzky theorem asserts that appropriate geometric projections of
any high-dimensional convex body are approximately Euclidean balls (see Milman [23]
and references therein). The analogous statement for probability measures should perhaps
be the following (see Gromov [11]): Appropriate measure projections of any decent
high-dimensional probability measure are approximately spherically-symmetric. When can we
say that a probability measure $\mu$ on $\mathbb{R}^d$ is approximately radially-symmetric?

We need some notation. Let $\mu$ be a finite measure on a measurable space $\Omega$. For a
subset $A \subseteq \Omega$ with $\mu(A) > 0$ we write $\mu|_A$ for the conditioning of $\mu$ on $A$, i.e.,

$$ \mu|_A(B) = \mu(A \cap B)/\mu(A) $$

for any measurable set $B \subseteq \Omega$. Write $S^{d-1}$ for the unit sphere centered at the origin in
$\mathbb{R}^d$. The uniform probability measure on the sphere $S^{d-1}$ is denoted by $\sigma_{d-1}$. For two
probability measures $\mu$ and $\nu$ on the sphere $S^{d-1}$ and $1 \leq p < \infty$ we write $W_p(\mu, \nu)$ for
the $L^p$ Monge-Kantorovich transportation distance between $\mu$ and $\nu$ in the sphere $S^{d-1}$
endowed with the geodesic distance (see, e.g., [33] or Section 2 below). The metrics $W_p$
are all equivalent (we have $W_1 \leq W_p \leq \pi W_1^{1/p}$) and they metrize weak convergence
of probability measures. For an interval $J \subset (0, \infty)$ we consider the spherical shell
$S(J) = \{x \in \mathbb{R}^d ; |x| \in J\}$, where $|\cdot|$ is the standard Euclidean norm in $\mathbb{R}^d$. The radial
projection in $\mathbb{R}^d$ is the map $\mathcal{R}(x) = x/|x|$. An interval is either open, closed or half-open
and half-closed.

Definition 1.2 (Gromov [12]) Let $\mu$ be a Borel probability measure on $\mathbb{R}^d$ and let $\varepsilon > 0$.
We say that $\mu$ is "$\varepsilon$-radial" if for any interval $J \subset (0, \infty)$ with $\mu(S(J)) \geq \varepsilon$, we have

$$ W_1\left(\mathcal{R}_*(\mu|_{S(J)}), \sigma_{d-1}\right) \leq \varepsilon. $$

That is, when we condition $\mu$ on any spherical shell that contains at least an $\varepsilon$-fraction
of the mass, and then project radially to the sphere, we obtain an approximation to the
uniform probability measure on the sphere in the transportation-metric sense.
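To make the definition concrete, here is a small sketch for discrete measures in dimension $d = 1$. It is an illustration under stated assumptions rather than anything taken from the paper: the sphere $S^0 = \{-1, 1\}$ carries the geodesic distance $\pi$ between its two points, so that $W_1(\nu, \sigma_0) = \pi\,|\nu(\{1\}) - 1/2|$, and an interval $J$ matters only through the contiguous run of atom radii that it contains. The function name `is_eps_radial_1d` is hypothetical.

```python
import math


def is_eps_radial_1d(atoms, eps):
    """Check eps-radiality of a discrete probability measure on the real line.

    atoms: dict mapping points of R to their probability weights.
    """
    # Collect (total mass, mass on the positive side) for each radius |x| > 0.
    by_radius = {}
    for x, w in atoms.items():
        if x == 0:
            continue  # shells S(J) with J inside (0, inf) never contain the origin
        total, plus = by_radius.get(abs(x), (0.0, 0.0))
        by_radius[abs(x)] = (total + w, plus + (w if x > 0 else 0.0))
    radii = sorted(by_radius)
    # Every interval J selects a contiguous run of radii, and conversely.
    for i in range(len(radii)):
        total = plus = 0.0
        for j in range(i, len(radii)):
            t, p = by_radius[radii[j]]
            total += t
            plus += p
            if total >= eps:
                w1 = math.pi * abs(plus / total - 0.5)
                if w1 > eps:
                    return False
    return True


# Mass 1/2 at +1 and 1/2 at -2: even as a whole, but not shell by shell.
print(is_eps_radial_1d({1: 0.5, -2: 0.5}, 0.6))
print(is_eps_radial_1d({1: 0.5, -2: 0.5}, 0.4))
```

The two calls illustrate why the definition quantifies over shells: with the larger parameter only the full shell is heavy enough to be tested, while with the smaller parameter the one-sided shell of radius 1 is tested as well.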

Note that this definition is scale-invariant. We think of the dimension $n$ from
Definition 1.1 as a very large number, tending to infinity. On the other hand, we usually view
the dimension $d$ in Definition 1.2 as being fixed, and typically not very large. The case
$d = 1$ of Definition 1.2 corresponds to the measure being approximately even. We are not
sure whether Dirac's measure $\delta_0$ is a good example of an $\varepsilon$-radial measure. An $\varepsilon$-radial
measure $\mu$ is said to be "proper" if $\mu(\{0\}) = 0$. Our main theorem reads as follows:

Theorem 1.3 There exists a universal constant $C > 0$ for which the following holds: Let
$0 < \varepsilon < 1$ and let $d, n$ be positive integers. Suppose that

$$ n \geq (C/\varepsilon)^{Cd}. \tag{2} $$

Then, for any decent probability measure $\mu$ on $\mathbb{R}^n$, there exists a linear map $T : \mathbb{R}^n \to \mathbb{R}^d$
such that $T_*(\mu)$ is $\varepsilon$-radial proper.

Furthermore, let $\eta > 0$ be such that $\eta^{-1} \geq (C/\varepsilon)^{Cd}$. Then, for any $\eta$-decent
probability measure $\mu$ on $\mathbb{R}^n$, there exists a linear map $T : \mathbb{R}^n \to \mathbb{R}^d$ such that $T_*(\mu)$ is
$\varepsilon$-radial proper.

Gromov has a topological proof for the cases $d = 1, 2$ of Theorem 1.3 which does
not seem to generalize to higher dimensions [12]. Theorem 1.3 is tight, up to the
value of the constant $C$, as demonstrated by the example where $\mu$ is distributed uniformly
on $n$ linearly independent vectors: In this case $\mu$ is decent, but for any linear map $T$ and
an interval $J$, the discrete measure $\mathcal{R}_*((T_*\mu)|_{S(J)})$ is composed of at most $n$ atoms. It
is not difficult to see that when the support of $\nu$ contains no more than $\varepsilon^{-(d-1)}$ points,
we have the lower bound $W_1(\nu, \sigma_{d-1}) \geq c\varepsilon$, for a certain universal constant $c > 0$. It
is desirable to find the best constant in the exponent in Theorem 1.3, perhaps also with
respect to other notions of $\varepsilon$-radial measures.

The conclusion of Theorem 1.3 also holds when the measure $\mu$ is assumed to be only
"weakly $\varepsilon$-decent", except that $T_*(\mu)$ is no longer necessarily proper. Another possibility
in this context is to allow affine maps in Theorem 1.3 in place of linear maps, and obtain
a measure $T_*(\mu)$ which is $\varepsilon$-radial proper. (It is also possible to modify Definition 1.1
slightly, and require that (1) hold for all affine subspaces of dimension at least one. The
effect of such a modification is minor, since an $\varepsilon$-decent measure will remain at most
$2\varepsilon$-decent after such a change.)

The conclusion of Theorem 1.3 does not necessarily hold for non-decent measures,
even when their support spans the entire $\mathbb{R}^n$: Let $e_1, \ldots, e_n$ be linearly independent
vectors in $\mathbb{R}^n$, and consider the probability measure $\mu = (1 - 2^{-n})^{-1} \sum_{i=1}^n 2^{-i} \delta_{e_i}$, where
$\delta_x$ is Dirac's unit mass at $x \in \mathbb{R}^n$. Then $\mu$ is not decent, and none of the two-dimensional
marginals of $\mu$ are $\varepsilon$-radial proper, for $\varepsilon = 1/10$.
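The failure of decency in this example is easy to verify exactly. The sketch below assumes the measure has weights proportional to $2^{-i}$ on $e_1, \ldots, e_n$, normalized by $1 - 2^{-n}$, and checks that the line spanned by $e_1$ alone already carries more than $1/n$ of the mass; the helper names are hypothetical.

```python
from fractions import Fraction


def geometric_weights(n):
    """Weights of the measure with mass proportional to 2^{-i} on e_i, i = 1..n."""
    norm = 1 - Fraction(1, 2 ** n)
    return [Fraction(1, 2 ** i) / norm for i in range(1, n + 1)]


def violates_decency(n):
    """True if the one-dimensional subspace spanned by e_1 has mass above 1/n."""
    return geometric_weights(n)[0] > Fraction(1, n)


print(sum(geometric_weights(10)), violates_decency(10))
```

Since the first weight always exceeds $1/2$, the violation occurs for every $n \geq 2$; this only addresses decency, not the claim about two-dimensional marginals.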

As in Milman's proof of Dvoretzky's theorem (see [23]), Theorem 1.3 will be proved
by demonstrating that a random linear map $T$ works with positive probability, once the
measure $\mu$ is put in the right "position". That is, we first push-forward $\mu$ under an
appropriate invertible linear map in $\mathbb{R}^n$, which is non-random, and only then do we project the
resulting probability measure to a random $d$-dimensional subspace, distributed uniformly
in the Grassmannian. The measure $\mu$ is in the correct "position" when the covariance
matrix of $\mathcal{R}_*\mu$ is proportional to the identity matrix. If we assume that the covariance matrix
of $\mu$ itself is proportional to the identity, then a random projection will not work, in
general, with high probability (compare with Sudakov's theorem; see [29] or the presentation
by Bobkov).

Here is an outline of the proof of Theorem 1.3 and also of the structure of this
manuscript: In Section 5 we use the non-degeneracy conditions from Definition 1.1 in
order to guarantee the existence of the initial linear transformation that puts $\mu$ in the right
"position". Once we know that the covariance matrix of $\mathcal{R}_*\mu$ is approximately a scalar
matrix, we prove that the measure $\mu$ may be decomposed into many almost-orthogonal
ensembles. Each such ensemble is simply a discrete probability measure, uniform on a
collection of approximately-orthogonal vectors in $\mathbb{R}^n$ that are not necessarily of the same
length. This decomposition, which essentially appeared earlier in the work of Bourgain,
Lindenstrauss and Milman [6], is discussed in Section 4. Section 3 is concerned with the
analysis of a single ensemble of our decomposition. As it turns out, a random projection
works with high probability, and transforms the discrete measure into an almost-radial
one. Section 2 contains a preliminary discussion regarding $\varepsilon$-radial measures and the
transportation metric. The proof of Theorem 1.3 is completed in Section 6, in which we
also make some related comments and prove the following corollary to Theorem 1.3.

Corollary 1.4 There exists a sequence $R_n \to \infty$ with the following property: Let $\mu$
be a decent probability measure on $\mathbb{R}^n$. Then, there exists a non-zero linear functional
$\varphi : \mathbb{R}^n \to \mathbb{R}$ such that

$$ \mu(\{x ; \varphi(x) \geq tM\}) \geq c \exp(-Ct^2) \quad \text{for all } 0 \leq t \leq R_n $$

and

$$ \mu(\{x ; \varphi(x) \leq -tM\}) \geq c \exp(-Ct^2) \quad \text{for all } 0 \leq t \leq R_n $$

where $M > 0$ is a median, that is,

$$ \mu(\{x ; |\varphi(x)| \leq M\}) \geq 1/2 \quad \text{and} \quad \mu(\{x ; |\varphi(x)| \geq M\}) \geq 1/2 \tag{3} $$

and $c, C > 0$ are universal constants. Moreover, one may take $R_n = c(\log n)^{1/4}$.

In other words, any high-dimensional probability measure has super-gaussian marginals.
Furthermore, as is evident from the proof, most of the marginals are super-gaussian when
the measure is in the right "position". In the case of independent random variables,
Corollary 1.4 essentially goes back to Kolmogorov [20]. See also Nagaev [25].

In Section 7 we formulate our results in an infinite-dimensional setting. Unless stated
otherwise, throughout the text the letters $c, C, C', \tilde{c}$ etc. stand for various positive
universal constants, whose value may change from one instance to the next. We usually
denote by lower-case $c, \tilde{c}, c', \bar{c}$ etc. positive universal constants that are assumed
sufficiently small, and by upper-case $C, \tilde{C}, C', \bar{C}$ etc. sufficiently large universal constants.
We write $x \cdot y$ for the usual scalar product of $x, y \in \mathbb{R}^n$.

Acknowledgments. I would like to thank Misha Gromov for his interest in this work 
and for introducing me to the problem, to Vitali Milman for encouraging me to write this 
note, to Boris Tsirelson for his explanations regarding measures on infinite-dimensional 
linear spaces, to Sasha Sodin for reading a preliminary version of this text and to Semyon 
Alesker, Noga Alon and Apostolos Giannopoulos for related discussions. 

2 Transportation distance and empirical distributions 

Let $(X, \rho)$ be a metric space and let $\mu_1, \mu_2$ be Borel probability measures on $X$. A
coupling of $\mu_1$ and $\mu_2$ is a Borel probability measure $\gamma$ on $X \times X$ whose first marginal
is $\mu_1$ and whose second marginal is $\mu_2$, that is, $(P_1)_*\gamma = \mu_1$ and $(P_2)_*\gamma = \mu_2$ where
$P_1(x, y) = x$ and $P_2(x, y) = y$. The Monge-Kantorovich distance is

$$ W_1(\mu_1, \mu_2) = \inf_{\gamma} \int_{X \times X} \rho(x, y)\, d\gamma(x, y) $$

where the infimum runs over all couplings $\gamma$ of $\mu_1$ and $\mu_2$. Then $W_1$ is a metric, and it
satisfies the convexity relation

$$ W_1(\lambda\mu_1 + (1 - \lambda)\mu_2, \nu) \leq \lambda W_1(\mu_1, \nu) + (1 - \lambda) W_1(\mu_2, \nu) \tag{4} $$

for any $0 < \lambda < 1$ and probability measures $\mu_1, \mu_2, \nu$ on $X$. The Kantorovich-Rubinstein
duality theorem (see [33, Theorem 1.14]) states that

$$ W_1(\mu, \nu) = \sup_{\varphi} \int_X \varphi \, d[\mu - \nu] \tag{5} $$

where the supremum runs over all 1-Lipschitz functions $\varphi : X \to \mathbb{R}$ (i.e., $|\varphi(x) - \varphi(y)| \leq
\rho(x, y)$ for all $x, y \in X$). We are concerned mostly with the case where the metric
space $X$ is the Euclidean sphere $S^{d-1}$ with the metric $\rho(x, y)$ being the geodesic distance
in $S^{d-1}$, i.e., $\cos \rho(x, y) = x \cdot y$. Denote by $\mathcal{M}(S^{d-1})$ the space of Borel probability
measures on $S^{d-1}$, endowed with the weak* topology and the corresponding Borel
$\sigma$-algebra. Similarly, $\mathcal{M}(\mathbb{R}^d)$ is the space of Borel probability measures on $\mathbb{R}^d$, endowed
with the weak* topology (convergence of integrals of compactly-supported continuous
functions) and $\sigma$-algebra. A measure here always means a non-negative measure. The
total variation distance between two measures $\mu$ and $\nu$ on a measurable space $\Omega$ is

$$ d_{TV}(\mu, \nu) = \sup_{A} |\mu(A) - \nu(A)| $$

where the supremum runs over all measurable sets $A \subseteq \Omega$. Clearly, for $\mu, \nu \in \mathcal{M}(S^{d-1})$,

$$ W_1(\mu, \nu) \leq \pi \, d_{TV}(\mu, \nu). \tag{6} $$
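Inequality (6) can be checked numerically on the circle $S^1$ for measures supported on equally spaced points. The sketch below is an illustration only: it computes $W_1$ through the standard reduction of transport on a cycle to a one-parameter family of edge flows (the flow across each edge is a running sum of mass differences, up to a common constant, and a median of the running sums minimizes the total cost). This reduction and the function names are choices made for this example, not something used in the text.

```python
import math


def tv_distance(p, q):
    """sup_A |p(A) - q(A)| for two probability vectors on the same finite set."""
    return sum(max(a - b, 0.0) for a, b in zip(p, q))


def w1_circle(p, q):
    """W_1 between probability vectors on m equally spaced points of the unit circle,
    with respect to the geodesic (arc-length) distance."""
    m = len(p)
    h = 2 * math.pi / m  # arc length between neighbouring points
    flows = []
    running = 0.0
    for a, b in zip(p, q):
        running += a - b
        flows.append(running)
    c = sorted(flows)[m // 2]  # a median minimizes sum |flow - c|
    return h * sum(abs(f - c) for f in flows)


p = [1.0, 0.0, 0.0, 0.0]       # unit mass at angle 0
q = [0.25, 0.25, 0.25, 0.25]   # uniform on four points
print(w1_circle(p, q), math.pi * tv_distance(p, q))
```

For two antipodal point masses both sides of (6) equal $\pi$, so the constant $\pi$ cannot be improved on the circle.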

Additionally, $d_{TV}(S_*\mu, S_*\nu) \leq d_{TV}(\mu, \nu)$ for any measures $\mu, \nu$ and a measurable
map $S$. When $S$ is a $\lambda$-Lipschitz map between metric spaces, we obtain the inequality
$W_1(S_*\mu, S_*\nu) \leq \lambda W_1(\mu, \nu)$. The following lemma is an obvious consequence of (4)
and (6), via Jensen's inequality.

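Lemma 2.1 Let $d$ be a positive integer, $0 < \varepsilon < 1$ and $\mu \in \mathcal{M}(S^{d-1})$. Suppose that
we are given a "random probability measure" on $S^{d-1}$. That is, let $\lambda$ be a probability
measure on a measurable space $\Omega$, and suppose that with any $\alpha \in \Omega$ we associate a
probability measure $\mu_\alpha \in \mathcal{M}(S^{d-1})$ such that the map $\Omega \ni \alpha \mapsto \mu_\alpha \in \mathcal{M}(S^{d-1})$ is
measurable. Assume that

$$ d_{TV}\left(\mu, \int_\Omega \mu_\alpha \, d\lambda(\alpha)\right) \leq \varepsilon. $$

Then,

$$ W_1(\mu, \sigma_{d-1}) \leq \int_\Omega W_1(\mu_\alpha, \sigma_{d-1}) \, d\lambda(\alpha) + \pi\varepsilon \leq \sup_{\alpha \in \Omega} W_1(\mu_\alpha, \sigma_{d-1}) + \pi\varepsilon. $$

Recall that $\mu|_X$ stands for the conditioning of $\mu$ to $X$.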



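Lemma 2.2 Suppose that $\mu$ and $\nu$ are finite measures on a measurable space $\Omega$ and let
$\varepsilon > 0$. Let $X \subseteq \Omega$ be such that $\nu(X) > \varepsilon$. Suppose that

$$ |\mu(A) - \nu(A)| \leq \varepsilon \quad \text{for all } A \subseteq X. $$

Then $d_{TV}(\mu|_X, \nu|_X) \leq 2\varepsilon/\nu(X)$.

Proof: For any $A \subseteq X$,

$$ \left| \nu|_X(A) - \mu|_X(A) \right| = \left| \frac{\nu(A)}{\nu(X)} - \frac{\mu(A)}{\mu(X)} \right| \leq \frac{|\nu(A) - \mu(A)|}{\nu(X)} + \mu(A) \left| \frac{1}{\nu(X)} - \frac{1}{\mu(X)} \right| \leq \frac{\varepsilon}{\nu(X)} + \frac{\varepsilon}{\nu(X)} = \frac{2\varepsilon}{\nu(X)}, $$

since $\mu(A) \leq \mu(X)$. □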
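As a sanity check, the bound of Lemma 2.2 can be evaluated on small discrete measures. The following sketch uses hypothetical helper names; it takes $\varepsilon$ to be the smallest admissible value, namely the supremum of $|\mu(A) - \nu(A)|$ over subsets $A \subseteq X$, which for finite measures on a finite set is the larger of the total positive and total negative parts of $\mu - \nu$ on $X$.

```python
def sup_diff(mu, nu, keys):
    """sup over subsets A of `keys` of |mu(A) - nu(A)|, for finite measures given as dicts."""
    pos = sum(max(mu[k] - nu[k], 0.0) for k in keys)
    neg = sum(max(nu[k] - mu[k], 0.0) for k in keys)
    return max(pos, neg)


def condition(m, keys):
    """The conditioning m|_X, normalized to total mass one."""
    total = sum(m[k] for k in keys)
    return {k: m[k] / total for k in keys}


def lemma_2_2_sides(mu, nu, X):
    """Return (d_TV(mu|_X, nu|_X), 2 * eps / nu(X)) with eps the smallest admissible value."""
    eps = sup_diff(mu, nu, X)
    nu_X = sum(nu[k] for k in X)
    if not nu_X > eps:
        raise ValueError("hypothesis nu(X) > eps fails")
    lhs = sup_diff(condition(mu, X), condition(nu, X), X)
    return lhs, 2 * eps / nu_X


mu = {"a": 0.2, "b": 0.3, "c": 0.1}
nu = {"a": 0.25, "b": 0.25, "c": 0.2}
print(lemma_2_2_sides(mu, nu, ["a", "b", "c"]))
```

Note that $\mu$ and $\nu$ need not have the same total mass on $X$, which is why both the positive and the negative parts are compared.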
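Next we describe a few simple properties of $\varepsilon$-radial measures.

Lemma 2.3 Let $d$ be a positive integer and $0 < \varepsilon < 1/2$. Let $\mu, \nu$ be Borel probability
measures on $\mathbb{R}^d$. Additionally, assume that we are given a "random probability measure"
on $\mathbb{R}^d$. That is, let $\lambda$ be a probability measure on a measurable space $\Omega$. Suppose that
with any $\alpha \in \Omega$ we associate a measure $\mu_\alpha \in \mathcal{M}(\mathbb{R}^d)$ such that the map $\Omega \ni \alpha \mapsto \mu_\alpha \in
\mathcal{M}(\mathbb{R}^d)$ is measurable.

(a) Suppose that $\mu$ is $\varepsilon$-radial and that $d_{TV}(\mu, \nu) \leq \varepsilon^2$. Then $\nu$ is $5\varepsilon$-radial.

(b) Suppose that $\mu_\alpha$ is $\varepsilon$-radial for any $\alpha \in \Omega$. Assume that $\mu = \int_\Omega \mu_\alpha \, d\lambda(\alpha)$. Then $\mu$
is $4\sqrt{\varepsilon}$-radial.

(c) Suppose that $A \subseteq \Omega$ satisfies $\lambda(A) \geq 1 - \varepsilon$, and $\mu_\alpha$ is $\varepsilon$-radial for any $\alpha \in A$.
Assume that $\mu = \int_\Omega \mu_\alpha \, d\lambda(\alpha)$. Then $\mu$ is $20\sqrt{\varepsilon}$-radial.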

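Proof: Begin with (a). Let $J \subset (0, \infty)$ be an interval with $\nu(S(J)) \geq 2\varepsilon$, where
$S(J) = \{x \in \mathbb{R}^d ; |x| \in J\}$ is a spherical shell. Denote $\nu_J = \nu|_{S(J)}$ and $\mu_J = \mu|_{S(J)}$.
Since $d_{TV}(\mu, \nu) \leq \varepsilon^2$, we may apply Lemma 2.2 with $\varepsilon^2$ and $X = S(J)$. We conclude
that $d_{TV}(\mu_J, \nu_J) \leq 2\varepsilon^2/(2\varepsilon) = \varepsilon$. Consequently,

$$ d_{TV}(\mathcal{R}_*(\mu_J), \mathcal{R}_*(\nu_J)) \leq \varepsilon. \tag{7} $$

Since $\mu$ is $\varepsilon$-radial and $\mu(S(J)) \geq 2\varepsilon - \varepsilon^2 \geq \varepsilon$, then $W_1(\mathcal{R}_*(\mu_J), \sigma_{d-1}) \leq \varepsilon$ according to
Definition 1.2. From (6), (7) and the triangle inequality, $W_1(\mathcal{R}_*(\nu_J), \sigma_{d-1}) \leq (\pi + 1)\varepsilon \leq
5\varepsilon$. This completes the proof of (a).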

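We move to the proof of (b). Let $J \subset (0, \infty)$ be an interval with $\mu(S(J)) \geq 4\sqrt{\varepsilon}$.
Let $X = \{\alpha \in \Omega ; \mu_\alpha(S(J)) \geq \varepsilon\}$. Denote $\nu = \int_X \mu_\alpha \, d\lambda(\alpha)$, a finite Borel measure on
$\mathbb{R}^d$. Then for any $A \subseteq S(J)$,

$$ |\mu(A) - \nu(A)| = \int_{\Omega \setminus X} \mu_\alpha(A) \, d\lambda(\alpha) \leq \int_{\Omega \setminus X} \mu_\alpha(S(J)) \, d\lambda(\alpha) \leq \varepsilon \lambda(\Omega \setminus X) \leq \varepsilon. \tag{8} $$

Denote $\mu_J = \mu|_{S(J)}$ and $\nu_J = \nu|_{S(J)}$. From (8) and Lemma 2.2,

$$ d_{TV}(\mu_J, \nu_J) \leq 2\varepsilon/(4\sqrt{\varepsilon}) = \sqrt{\varepsilon}/2. \tag{9} $$

Note that $\nu_J = \int_X \mu_\alpha|_{S(J)} \, d\lambda'(\alpha)$ where $\lambda'$ is a certain probability measure on $X$. Since
$\mu_\alpha$ is $\varepsilon$-radial and $\mu_\alpha(S(J)) \geq \varepsilon$ for $\alpha \in X$, then from Definition 1.2,

$$ W_1(\mathcal{R}_*(\mu_\alpha|_{S(J)}), \sigma_{d-1}) \leq \varepsilon \quad \text{for } \alpha \in X. \tag{10} $$

We have $\mathcal{R}_*(\nu_J) = \int_X \mathcal{R}_*(\mu_\alpha|_{S(J)}) \, d\lambda'(\alpha)$. Thus, (10) and Lemma 2.1 yield that
$W_1(\mathcal{R}_*(\nu_J), \sigma_{d-1}) \leq \varepsilon$. Combining the last inequality with (6) and (9), we see that

$$ W_1(\mathcal{R}_*(\mu_J), \sigma_{d-1}) \leq 4 \, d_{TV}(\mathcal{R}_*(\mu_J), \mathcal{R}_*(\nu_J)) + W_1(\mathcal{R}_*(\nu_J), \sigma_{d-1}) \leq 2\sqrt{\varepsilon} + \varepsilon \leq 4\sqrt{\varepsilon}. $$

Since $\mu_J = \mu|_{S(J)}$, and $J \subset (0, \infty)$ is an arbitrary interval with $\mu(S(J)) \geq 4\sqrt{\varepsilon}$, the
assertion (b) is proven.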

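To prove (c), denote $\nu = \int_A \mu_\alpha \, d\lambda|_A(\alpha)$, a probability measure on $\mathbb{R}^d$. Then $\nu$ is
$4\sqrt{\varepsilon}$-radial, according to (b). Furthermore, clearly $\lambda(\Omega \setminus A) = 1 - \lambda(A) \leq \varepsilon$, and
hence $d_{TV}(\mu, \nu) \leq \varepsilon \leq (4\sqrt{\varepsilon})^2$. According to part (a), the measure $\mu$ is $20\sqrt{\varepsilon}$-radial,
and (c) is proven. □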

Probability measures are the protagonists of this text. Some of our constructions of 
probability measures are probabilistic in nature. To avoid confusion, we try to distinguish 
sharply between the measures themselves, and the randomness used in their construction. 
Whenever we have objects that are declared random (for instance, random vectors in
$S^{d-1}$), all statements containing probability estimates or using the symbol $\mathbb{P}$ refer to these
random objects and only to them.

The crude bound in the following lemma is certainly a standard application of the
so-called "empirical distribution method" (see, e.g., [5]). We were not able to find it in the
literature, so a proof is provided. Recall that $\delta_x$ stands for the Dirac unit mass at the point $x$.



Lemma 2.4 Let $d, N$ be positive integers, and let $X_1, \ldots, X_N$ be independent random
vectors, distributed uniformly on $S^{d-1}$. Denote $\mu = N^{-1} \sum_{i=1}^N \delta_{X_i}$. Then, with
probability greater than $1 - C \exp(-c\sqrt{N})$ of selecting $X_1, \ldots, X_N$,

$$ W_1(\mu, \sigma_{d-1}) \leq \frac{C}{N^{c/d}}, $$

where $C > 1$ and $0 < c < 1$ are universal constants.

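Proof: Denote by $\mathcal{F}$ the class of all 1-Lipschitz functions $\varphi : S^{d-1} \to \mathbb{R}$ such that
$\int \varphi \, d\sigma_{d-1} = 0$. Note that $\sup |\varphi| \leq \pi$ for any $\varphi \in \mathcal{F}$. According to (5),

$$ W_1(\mu, \sigma_{d-1}) = \sup_{\varphi \in \mathcal{F}} \int \varphi \, d\mu. \tag{11} $$

Let $\varepsilon > 0$ be a parameter to be specified later on. A subset $\mathcal{N} \subset S^{d-1}$ is an $\varepsilon$-net if for
any $x \in S^{d-1}$ there exists $y \in \mathcal{N}$ with $\rho(x, y) \leq \varepsilon$. Let $\mathcal{N}$ be an $\varepsilon$-net of cardinality
$\#(\mathcal{N}) \leq (C/\varepsilon)^d$ (see, e.g., [26, Lemma 4.16]). For $\varphi \in \mathcal{F}$ denote

$$ \tilde{\varphi}(x) = \min_{y \in \mathcal{N}} \left( \varepsilon \lceil \varphi(y)/\varepsilon \rceil + \rho(x, y) \right), $$

where $\lceil a \rceil$ is the minimal integer that is not smaller than $a$. Then $\tilde{\varphi}$ is a 1-Lipschitz
function, as a minimum of 1-Lipschitz functions. It is easily verified that $\varphi \leq \tilde{\varphi} \leq
\varphi + 3\varepsilon$. Denote $\tilde{\varphi}^\circ(x) = \tilde{\varphi}(x) - \int \tilde{\varphi}(y) \, d\sigma_{d-1}(y)$. Then $\tilde{\varphi}^\circ \in \mathcal{F}$ for any $\varphi \in \mathcal{F}$, and
$\sup |\tilde{\varphi}^\circ - \varphi| \leq 3\varepsilon$. Hence,

$$ W_1(\mu, \sigma_{d-1}) = \sup_{\varphi \in \mathcal{F}} \int \varphi \, d\mu \leq 3\varepsilon + \sup_{\varphi \in \mathcal{F}} \int \tilde{\varphi}^\circ \, d\mu = 3\varepsilon + \sup_{\varphi \in \mathcal{F}} \frac{1}{N} \sum_{i=1}^N \tilde{\varphi}^\circ(X_i). \tag{12} $$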
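The rounding step can be tried out on the circle $S^1$, parametrized by angle. The sketch below is an illustration under assumptions of its own: the net consists of equally spaced angles, the test function is $\sin$, which is 1-Lipschitz for the geodesic distance, and all function names are hypothetical. It builds the rounded function from the net and lets one verify the sandwich $\varphi \leq \tilde{\varphi} \leq \varphi + 3\varepsilon$ on a grid.

```python
import math


def geodesic(s, t):
    """Geodesic distance on the unit circle between the angles s and t."""
    diff = abs(s - t) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)


def make_net(eps):
    """Equally spaced angles such that every point is within eps of a net point."""
    m = math.ceil(math.pi / eps)  # spacing 2*pi/m is at most 2*eps
    return [2 * math.pi * j / m for j in range(m)]


def rounded(phi, net, eps):
    """The function x -> min_y ( eps * ceil(phi(y)/eps) + rho(x, y) ) over net points y."""
    levels = [(y, eps * math.ceil(phi(y) / eps)) for y in net]
    return lambda x: min(level + geodesic(x, y) for y, level in levels)


eps = 0.1
net = make_net(eps)
phi_tilde = rounded(math.sin, net, eps)
grid = [2 * math.pi * i / 1000 for i in range(1000)]
worst_gap = max(phi_tilde(x) - math.sin(x) for x in grid)
print(len(net), worst_gap)
```

The rounded function depends on $\varphi$ only through the integers $\lceil \varphi(y)/\varepsilon \rceil$ at the net points, which is what makes the family of rounded functions finite.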

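Denote $\tilde{\mathcal{F}} = \{\tilde{\varphi} ; \varphi \in \mathcal{F}\}$ and $\tilde{\mathcal{F}}^\circ = \{\tilde{\varphi}^\circ ; \varphi \in \mathcal{F}\}$. These sets are finite. In fact, as each
$\tilde{\varphi} \in \tilde{\mathcal{F}}$ is determined by the restriction of $\lceil \varphi/\varepsilon \rceil$ to the net $\mathcal{N}$, we have

$$ \#(\tilde{\mathcal{F}}^\circ) \leq \#(\tilde{\mathcal{F}}) \leq \left( \frac{2\pi}{\varepsilon} + 1 \right)^{\#(\mathcal{N})} \leq \exp\left( (C/\varepsilon)^{2d} \right). \tag{13} $$

Fix $\tilde{\varphi}^\circ \in \tilde{\mathcal{F}}^\circ$. Then $\tilde{\varphi}^\circ$ is a 1-Lipschitz function on the sphere $S^{d-1}$ with $\int \tilde{\varphi}^\circ \, d\sigma_{d-1} = 0$.
According to Lévy's lemma (see Milman and Schechtman [24, Section 2 and Appendix
V]), for any $i = 1, \ldots, N$,

$$ \mathbb{P}\left( |\tilde{\varphi}^\circ(X_i)| \geq t \right) \leq C \exp(-ct^2 d) \quad \forall t \geq 0, $$

where $\mathbb{P}$ refers, of course, to the probability of choosing the random vectors $X_1, \ldots, X_N$.
From Bernstein's inequality (see, e.g., Bourgain, Lindenstrauss and Milman [6,
Proposition 1]),

$$ \mathbb{P}\left( \left| \frac{1}{N} \sum_{i=1}^N \tilde{\varphi}^\circ(X_i) \right| \geq t \right) \leq C' \exp(-c' t^2 N d) \quad \forall t \geq 0. \tag{14} $$

Set $t = \varepsilon$ in (14). From (13) and (14),

$$ \mathbb{P}\left( \sup_{\varphi \in \mathcal{F}} \left| \frac{1}{N} \sum_{i=1}^N \tilde{\varphi}^\circ(X_i) \right| \geq \varepsilon \right) \leq C' \exp\left( (C/\varepsilon)^{2d} - c' \varepsilon^2 N d \right). \tag{15} $$

We now select $\varepsilon = C N^{-1/(2d+2)}$, for a sufficiently large universal constant $C > 0$.
Substitute the value of $\varepsilon$ in (15) and apply (12), to deduce that

$$ W_1(\mu, \sigma_{d-1}) \leq 3\varepsilon + \sup_{\varphi \in \mathcal{F}} \frac{1}{N} \sum_{i=1}^N \tilde{\varphi}^\circ(X_i) \leq 4\varepsilon \leq C' N^{-1/(2d+2)}, $$

with probability greater than $1 - C' \exp(-c' N^{d/(d+1)})$ of selecting $X_1, \ldots, X_N$. □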
Remark. The discrepancy of $\mu \in \mathcal{M}(S^{d-1})$ is defined as

$$ D(\mu) = \sup_{B} |\mu(B) - \sigma_{d-1}(B)| $$

where the supremum runs over all geodesic balls $B \subset S^{d-1}$. A result analogous to
Lemma 2.4 for discrepancy appears in Beck and Chen [3, Section 7.4]. It is possible to
adapt our technique to suit the discrepancy metric, and some of its variants, in place of
$W_1$. In fact, the only properties of the metric $W_1$ that are used in our proof are Lemma
2.4 and (4) and (6) above. Our method, of course, works for the $W_p$ metrics as long as
$1 \leq p < \infty$, but it does not seem to apply for the $W_\infty$ metric. The $W_\infty$ metric induces
a topology that is much stronger than weak convergence, and it is not even weaker than
convergence in norm.

3 Isotropic Gaussians

A centered Gaussian random vector in $\mathbb{R}^d$ is a random vector whose density is
proportional to $x \mapsto \exp(-Mx \cdot x)$ for a positive definite matrix $M$. A centered Gaussian
random vector is said to be isotropic if $M$ is a scalar matrix. It is called standard if
$M = Id/2$, where $Id$ is the identity matrix. Recall that $\mathcal{R}$ stands for radial projection.



Lemma 3.1 Let $d, N$ be positive integers and let $Z_1, \ldots, Z_N$ be independent isotropic
Gaussian random vectors in $\mathbb{R}^d$. Denote $\mu = N^{-1} \sum_{i=1}^N \delta_{Z_i}$.

Then, with probability greater than $1 - C \exp(-cN^{1/4})$ of selecting the $Z_i$'s, the
measure $\mu$ is $\delta$-radial, for $\delta = C N^{-c/d}$. Here, $C > 1$ and $0 < c < 1$ are universal
constants.

Proof: Set $\varepsilon = 5/N^{1/4}$. We may assume that $\varepsilon \leq 1/10$, as otherwise the
conclusion of the lemma is obvious for a suitable choice of universal constants $c, C > 0$. The
central observation is that the radii $|Z_1|, \ldots, |Z_N|$ are independent of the angular parts
$\mathcal{R}(Z_1), \ldots, \mathcal{R}(Z_N)$, and that the random vectors $\mathcal{R}(Z_1), \ldots, \mathcal{R}(Z_N)$ are independent
random vectors that are distributed uniformly on $S^{d-1}$.

With probability one, none of the $|Z_i|$'s are zero, and there are no $i \neq j$ with $|Z_i| =
|Z_j|$. We condition on the values $|Z_1|, \ldots, |Z_N|$, which are assumed to be distinct and
non-zero. For an interval $J \subset (0, \infty)$ write

$$ Z(J) = \{i ; |Z_i| \in J\} \quad \text{and} \quad w(J) = \#(Z(J)). $$

Denote $k = \lfloor 1/\varepsilon^2 \rfloor$, and let $J_1, \ldots, J_k \subset (0, \infty)$ be disjoint intervals such that

$$ w(J_i) = \lfloor Ni/k \rfloor - \lfloor N(i-1)/k \rfloor \quad \text{for } i = 1, \ldots, k. $$

[Figure 1: the disjoint intervals $J_1, J_2, J_3$ and the points $|Z_i|$ on the half-line.]

Since $\varepsilon^2 N \geq 2$, then $\varepsilon^2 N/2 \leq w(J_i) \leq 2\varepsilon^2 N$ for any $i$. For an interval $J \subset (0, \infty)$ with
$w(J) \neq 0$, denote

$$ \mu_J = \frac{1}{w(J)} \sum_{i \in Z(J)} \delta_{\mathcal{R}(Z_i)}. $$

Fix $i = 1, \ldots, k$. We abbreviate $\mu_i = \mu_{J_i}$. Since $\{\mathcal{R}(Z_j)\}_{j \in Z(J_i)}$ is a collection of
$w(J_i)$ independent random vectors, distributed uniformly on the sphere, then Lemma 2.4
applies, and yields

$$ \mathbb{P}\left( W_1(\mu_i, \sigma_{d-1}) \leq \frac{C}{w(J_i)^{c/d}} \right) \geq 1 - C \exp\left( -c\sqrt{w(J_i)} \right). $$

We now let $i$ vary. Since $w(J_i)$ has the order of magnitude of $\varepsilon^2 N$, then,

$$ \mathbb{P}\left( \max_{i=1,\ldots,k} W_1(\mu_i, \sigma_{d-1}) \leq \frac{C}{(\varepsilon^2 N)^{c/d}} \right) \geq 1 - Ck \exp(-c\varepsilon\sqrt{N}). \tag{16} $$

Write $\mathcal{I}$ for the collection of all non-empty intervals in $(0, \infty)$. Fix an interval $J$ with
$w(J) \geq N\varepsilon$. Let $J_{i_1}, \ldots, J_{i_\ell}$ be all the intervals among the $J_i$'s that are contained in $J$.
Then $J_{i_1} \cup \ldots \cup J_{i_\ell}$ covers all but at most $4\varepsilon^2 N$ of the $|Z_i|$'s that belong to the interval
$J$. Therefore,

$$ d_{TV}\left( \mu_J, \sum_{j=1}^{\ell} \lambda_j \mu_{i_j} \right) \leq \frac{4\varepsilon^2 N}{w(J)} \leq 4\varepsilon, $$

where $\lambda_1, \ldots, \lambda_\ell$ are appropriate non-negative coefficients that add to one. According to
Lemma 2.1,

$$ W_1(\mu_J, \sigma_{d-1}) \leq \max_{i=1,\ldots,k} W_1(\mu_i, \sigma_{d-1}) + 20\varepsilon \quad \text{for all } J \in \mathcal{I} \text{ with } w(J) \geq N\varepsilon. $$

We thus conclude from (16) that with probability at least $1 - Ck \exp(-c\varepsilon\sqrt{N})$,

$$ W_1(\mu_J, \sigma_{d-1}) \leq \frac{C}{(\varepsilon^2 N)^{c/d}} + 20\varepsilon \quad \text{for all } J \in \mathcal{I} \text{ with } w(J) \geq N\varepsilon. \tag{17} $$

The latter probability bound is valid under the conditioning on $|Z_1|, \ldots, |Z_N|$, and it holds
for all possible values of $|Z_1|, \ldots, |Z_N|$, up to measure zero. Hence the aforementioned
probability bound for (17) also holds with no conditioning at all. Recall that we write
$S(J) = \{x \in \mathbb{R}^d ; |x| \in J\}$, and note that $\mu_J = \mathcal{R}_*(\mu|_{S(J)})$ and $w(J) = N\mu(S(J))$.
Since $\varepsilon^2 N \geq N^{1/2}$, then (17) translates as follows: With probability greater than
$1 - C \exp(-cN^{1/4})$ of selecting $Z_1, \ldots, Z_N$,

$$ W_1(\mathcal{R}_*(\mu|_{S(J)}), \sigma_{d-1}) \leq C N^{-c/d} + C\varepsilon \quad \text{for all } J \in \mathcal{I} \text{ with } \mu(S(J)) \geq \varepsilon. $$

This means that $\mu$ is $C(N^{-c/d} + \varepsilon)$-radial with probability greater than $1 - C \exp(-cN^{1/4})$.
Since $N^{-c/d} + \varepsilon \leq C' N^{-c'/d}$, the lemma is proven. □

Lemma 3.2 Let $k$ be a positive integer. For an invertible $k \times k$ matrix $A$, write $\gamma_A$ for the
probability measure on $\mathbb{R}^k$ whose density is proportional to $x \mapsto \exp(-|Ax|^2/2)$. Then,
for any $k \times k$ invertible matrices $A$ and $B$,

$$ d_{TV}(\gamma_A, \gamma_B) \leq Ck \|BA^{-1} - Id\| $$

where $Id$ is the identity matrix, $\|\cdot\|$ is the operator norm, and $C > 0$ is a universal
constant.

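Proof: Let $X$ be a standard Gaussian random vector in $\mathbb{R}^k$. Then $\gamma_A(U) = \mathbb{P}(A^{-1}X \in U)$
for any measurable set $U \subseteq \mathbb{R}^k$. Therefore,

$$ d_{TV}(\gamma_A, \gamma_B) = \sup_{U \subseteq \mathbb{R}^k} \left| \mathbb{P}(X \in U) - \mathbb{P}(AB^{-1}X \in U) \right| = d_{TV}(\gamma_{BA^{-1}}, \gamma_{Id}). $$

Denote $M = BA^{-1}$, write $\gamma = \gamma_{Id}$ and set $\varepsilon = \|M - Id\| = \sup_{|x|=1} |Mx - x|$. We
write $\varphi_M(x) = (\det M)(2\pi)^{-k/2} \exp(-|Mx|^2/2)$ for the density of $\gamma_M$ and similarly
$\varphi$ stands for the density of $\gamma$. We may assume that $\varepsilon < 1/2$, as otherwise the conclusion
of the lemma is trivial. Then $\left| |Mx|^2 - |x|^2 \right| \leq 3\varepsilon|x|^2$ for any $x \in \mathbb{R}^k$, and also
$(1 + 2\varepsilon)^{-k} \leq \det M \leq (1 + \varepsilon)^k$. Therefore,

$$ |\varphi(x) - \varphi_M(x)| = \varphi(x) \left| 1 - (\det M) \exp\left( \frac{|x|^2 - |Mx|^2}{2} \right) \right| \leq \varphi(x) \left[ (1 + 2\varepsilon)^k \exp(3\varepsilon|x|^2) - 1 \right] $$

for any $x \in \mathbb{R}^k$. Consequently,

$$ d_{TV}(\gamma, \gamma_M) \leq \int_{\mathbb{R}^k} |\varphi(x) - \varphi_M(x)| \, dx \leq (1 + 2\varepsilon)^k \int_{\mathbb{R}^k} \exp(3\varepsilon|x|^2) \varphi(x) \, dx - 1. $$

However,

$$ \int_{\mathbb{R}^k} \exp(3\varepsilon|x|^2) \varphi(x) \, dx = (2\pi)^{-k/2} \int_{\mathbb{R}^k} \exp\left( -(1 - 6\varepsilon)|x|^2/2 \right) dx = (1 - 6\varepsilon)^{-k/2}. $$

We deduce that

$$ d_{TV}(\gamma, \gamma_M) \leq (1 + 2\varepsilon)^k (1 - 6\varepsilon)^{-k/2} - 1 \leq Ck\varepsilon, $$

under the legitimate assumption that $\varepsilon < c/k$ (otherwise, there is nothing to prove). □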
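The last inequality of the proof can be checked numerically for explicit constants. The values $C = 16$ and $c = 1/12$ used below are illustrative choices for this sketch, not the constants of the paper; they come from the elementary estimates $\log(1 + 2\varepsilon) \leq 2\varepsilon$ and $-\frac{1}{2}\log(1 - 6\varepsilon) \leq 6\varepsilon$ for $\varepsilon \leq 1/12$, followed by $e^x - 1 \leq x e^x$.

```python
def gaussian_tv_bound(k, eps):
    """The quantity (1 + 2*eps)^k * (1 - 6*eps)^(-k/2) - 1 from the end of the proof."""
    return (1 + 2 * eps) ** k * (1 - 6 * eps) ** (-k / 2) - 1


def check_linear_bound(C=16.0, c=1 / 12, ks=(1, 2, 5, 10, 100, 1000), steps=50):
    """Check gaussian_tv_bound(k, eps) <= C*k*eps on a grid of eps in (0, c/k].

    C and c are illustrative constants chosen for this sketch.
    """
    for k in ks:
        for s in range(1, steps + 1):
            eps = (c / k) * s / steps
            if gaussian_tv_bound(k, eps) > C * k * eps:
                return False
    return True


print(check_linear_bound())
```

The restriction $\varepsilon \leq c/k$ matters: for fixed $\varepsilon$ the left-hand side grows exponentially in $k$, so no linear bound can hold without it.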
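Consider the map $\mathcal{E} : (\mathbb{R}^d)^N \to \mathcal{M}(\mathbb{R}^d)$ defined by

$$ \mathcal{E}(x_1, \ldots, x_N) = \frac{1}{N} \sum_{i=1}^N \delta_{x_i}. $$

A Borel probability measure $\alpha$ on $(\mathbb{R}^d)^N$ thus induces the Borel probability measure $\mathcal{E}_*\alpha$
on the space $\mathcal{M}(\mathbb{R}^d)$. The next lemma is a small perturbation of Lemma 3.1.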



Lemma 3.3 Let $d, N$ be positive integers, $\varepsilon > 0$. Let $X_1, \ldots, X_N$ be independent,
standard Gaussian random vectors in $\mathbb{R}^d$. Let $(a_{ij})_{1 \leq j \leq i \leq N}$ be real numbers, with $a_{ii} \neq 0$
for all $i$, such that

$$ |a_{ij}| \leq \varepsilon |a_{ii}| \quad \text{for } j < i. \tag{18} $$

Denote $Y_i = \sum_{j \leq i} a_{ij} X_j$ and consider the probability measure $\mu = N^{-1} \sum_{i=1}^N \delta_{Y_i}$.

Then, with probability greater than $1 - C \exp(-cN^{1/4}) - CN^2 d^2 \varepsilon$ of selecting the
random vectors $X_1, \ldots, X_N$, the measure $\mu$ is $\delta$-radial for $\delta = CN^{-c/d}$. Here, $C > 1$
and $0 < c < 1$ are universal constants.

Proof: Denote $Z_i = a_{ii} X_i$. The $Z_i$'s are independent, isotropic, Gaussian random
vectors in $\mathbb{R}^d$. Denote by $U \subseteq \mathcal{M}(\mathbb{R}^d)$ the collection of all $\delta$-radial probability measures,
where $\delta = CN^{-c/d}$ is the same as in Lemma 3.1. Let $\alpha$ be the probability measure on
$(\mathbb{R}^d)^N$ that is the joint distribution of $Z_1, \ldots, Z_N$. According to Lemma 3.1,

$$ (\mathcal{E}_*\alpha)(U) \geq 1 - C \exp(-cN^{1/4}). $$

Let $\beta$ be the probability measure on $(\mathbb{R}^d)^N$ which is the joint distribution of $Y_1, \ldots, Y_N$.
To prove the lemma, we need to show that

$$ (\mathcal{E}_*\beta)(U) \geq 1 - C \exp(-cN^{1/4}) - C' N^2 d^2 \varepsilon. $$

This would follow, if we could prove that

$$ d_{TV}(\mathcal{E}_*\alpha, \mathcal{E}_*\beta) \leq d_{TV}(\alpha, \beta) \leq C' N^2 d^2 \varepsilon. \tag{19} $$

Let $k = dN$. Let $A$ be the $k \times k$ matrix that represents the linear operator

$$ (\mathbb{R}^d)^N \ni (x_1, \ldots, x_N) \mapsto (a_{11} x_1, \ldots, a_{NN} x_N) \in (\mathbb{R}^d)^N = \mathbb{R}^k. $$

That is, $A$ is a diagonal matrix, and the diagonal of $A$ contains each number $a_{ii}$ exactly
$d$ times. Denoting $X = (X_1, \ldots, X_N) \in (\mathbb{R}^d)^N = \mathbb{R}^k$ and $Z = (Z_1, \ldots, Z_N) \in
(\mathbb{R}^d)^N = \mathbb{R}^k$, we see that $Z = AX$. Therefore, in the notation of Lemma 3.2, we have
$\alpha = \gamma_{A^{-1}}$.

Similarly, let $B$ be the $k \times k$ matrix that corresponds to the linear operator

$$ (x_1, \ldots, x_N) \mapsto \Big( \sum_{j \leq i} a_{ij} x_j \Big)_{i=1,\ldots,N} $$

where $x_1, \ldots, x_N$ are vectors in $\mathbb{R}^d$. Denoting $Y = (Y_1, \ldots, Y_N) \in \mathbb{R}^k$, we have $Y =
BX$ and consequently $\beta = \gamma_{B^{-1}}$. Condition (18) implies that the off-diagonal elements
of $A^{-1}B$ do not exceed $\varepsilon$ in absolute value. The diagonal elements of $A^{-1}B$ are all ones.
Hence $\|A^{-1}B - Id\| \leq k\varepsilon$, and according to Lemma 3.2,

$$ d_{TV}(\alpha, \beta) = d_{TV}(\gamma_{B^{-1}}, \gamma_{A^{-1}}) \leq Ck \|A^{-1}B - Id\| \leq Ck^2 \varepsilon = CN^2 d^2 \varepsilon. $$

Thus (19) holds and the proof is complete. □
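The matrix estimate at the end of the proof can be illustrated in the simplest case $d = 1$, where $k = N$, $A$ is the diagonal matrix of the $a_{ii}$ and $B$ is the lower-triangular matrix $(a_{ij})$. The sketch below is a hedged illustration with hypothetical helper names; it bounds the operator norm of $A^{-1}B - Id$ by its Frobenius norm, which is a cruder route than the text needs but suffices to see the bound $k\varepsilon$.

```python
import math
import random


def random_coefficients(n, eps, seed=0):
    """Lower-triangular coefficients a[i][j], j <= i, satisfying |a_ij| <= eps * |a_ii|."""
    rng = random.Random(seed)
    a = []
    for i in range(n):
        diag = rng.uniform(0.5, 2.0)
        a.append([rng.uniform(-eps, eps) * diag for _ in range(i)] + [diag])
    return a


def perturbation_norm(a, eps):
    """Frobenius norm of A^{-1}B - Id when d = 1.

    Here (A^{-1}B)[i][j] = a[i][j] / a[i][i], so the diagonal entries are ones
    and only the entries with j < i contribute.
    """
    total = 0.0
    for i in range(len(a)):
        for j in range(i):
            ratio = a[i][j] / a[i][i]
            if abs(ratio) > eps + 1e-12:
                raise ValueError("the smallness condition on a_ij is violated")
            total += ratio * ratio
    return math.sqrt(total)


n, eps = 6, 0.01
norm = perturbation_norm(random_coefficients(n, eps), eps)
print(norm, n * eps)
```

Since the operator norm never exceeds the Frobenius norm, the printed value being below $k\varepsilon$ confirms the estimate for this sample.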

4 Orthogonal decompositions 

This section is concerned with probability measures on $\mathbb{R}^n$ that may be decomposed as a
mixture, whose components are mostly ensembles of approximately-orthogonal vectors.
Later on, we will apply a random projection, and use Lemma 3.3 in order to show that the
projection of most elements in the mixture is typically $\varepsilon$-radial for small $\varepsilon > 0$.

Definition 4.1 Let $\ell, n$ be positive integers and let $\varepsilon > 0$. Let $v_1, \ldots, v_\ell \in \mathbb{R}^n$ be
non-zero vectors, and consider $v = (v_1, \ldots, v_\ell)$, an $\ell$-tuple of vectors. We say that $v$ is
"$\varepsilon$-orthogonal" if there exist orthonormal vectors $w_1, \ldots, w_\ell \in \mathbb{R}^n$ and real numbers
$(a_{ij})_{1 \leq j \leq i \leq \ell}$ such that

$$ v_i = \sum_{j \leq i} a_{ij} w_j \quad \text{for } i = 1, \ldots, \ell $$

and $|a_{ij}| \leq \varepsilon |a_{ii}|$ for $j < i$.

Note that $(v_1, \ldots, v_\ell)$ is $\varepsilon$-orthogonal if and only if $(\mathcal{R}(v_1), \ldots, \mathcal{R}(v_\ell))$ is $\varepsilon$-orthogonal.
We write $\mathcal{O}_{\ell,\varepsilon} \subseteq (\mathbb{R}^n)^\ell$ for the collection of all $\varepsilon$-orthogonal $\ell$-tuples $v = (v_1, \ldots, v_\ell) \in
(\mathbb{R}^n)^\ell$. For a subspace $E \subseteq \mathbb{R}^n$ denote by $Proj_E$ the orthogonal projection operator onto
$E$ in $\mathbb{R}^n$.
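Definition 4.1 can be tested mechanically through the Gram-Schmidt process. The sketch below rests on an observation that is not spelled out in the text: since every $a_{ii}$ must be non-zero, the span of $w_1, \ldots, w_i$ coincides with the span of $v_1, \ldots, v_i$, so the orthonormal vectors are the Gram-Schmidt vectors up to sign and the absolute values $|a_{ij}|$ are determined by $v$. The function names and the floating-point tolerance are assumptions of this example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def is_eps_orthogonal(vectors, eps, tol=1e-12):
    """Gram-Schmidt test of eps-orthogonality for a tuple of vectors in R^n."""
    basis = []
    for v in vectors:
        coeffs = [dot(v, w) for w in basis]  # the coefficients a_ij for j < i
        residual = [x - sum(c * w[t] for c, w in zip(coeffs, basis))
                    for t, x in enumerate(v)]
        a_ii = math.sqrt(dot(residual, residual))
        if a_ii <= tol:
            return False  # v lies in the span of the earlier vectors
        if any(abs(c) > eps * a_ii for c in coeffs):
            return False
        basis.append([x / a_ii for x in residual])
    return True


print(is_eps_orthogonal([(1.0, 0.0), (1.0, 1.0)], 0.5))
print(is_eps_orthogonal([(1.0, 0.0), (1.0, 1.0)], 1.0))
```

The notion is not symmetric in the order of the vectors: each vector is compared with the span of the vectors that precede it in the tuple.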

Lemma 4.2 Let $\ell, n$ be positive integers. Suppose that $\mu$ is a Borel probability measure
on the unit sphere $S^{n-1}$ such that for any unit vector $\theta \in S^{n-1}$,

Let $X_1, \ldots, X_\ell$ be independent random vectors in $S^{n-1}$, all distributed according to $\mu$.
Then, with positive probability, $(X_1, \ldots, X_\ell)$ is $(\ell^{-20}/2)$-orthogonal.

Proof: We may assume that $\ell \geq 2$, otherwise the lemma is vacuously true. Write
$W_1, \ldots, W_\ell \in \mathbb{R}^n$ for the vectors obtained from $X_1, \ldots, X_\ell$ through the Gram-Schmidt
orthogonalization process. (If the $X_i$'s are not linearly independent, then some of the
$W_i$'s might be zero.) For $i \geq 2$ denote by $E_i$ the linear span of $X_1, \ldots, X_{i-1}$. Then, for
$i \geq 2$,

as $X_i$ is independent of $W_1, \ldots, W_{i-1}$. By Chebyshev's inequality,

V{32<i<£;\ProjE^iX,)\>e-'^/2} < {£ - 1) -^-^ < Afr^ < 1. 

Therefore, with positive probability, \ProjEi{Xi)\ < i^'^-^/2 for all i > 2. In this 
event, the vectors Xi , . . . ,Xe are linearly independent, and Wi , ■ ■ ■ , Wg are orthonor- 
mal vectors. Furthermore, in this event an := \/l — jProj^;. > 1 — £~^^/2 while 

Qij :— Xi ■ Wj satisfies 

K\ < \ProjE,iX,)\ < i-^^/2 for J < i. 

Thus, with positive probability, Xi ~ J2j<i ^ij^j f^J" ^i'^h |a.y | < (£^^°/2)|aii| 
for j < i and with Wi, . . . , We being orthonormal vectors. This completes the proof. 
Note that the "positive probability" is in fact greater than 1 — £^^. □ 

The next lemma is essentially a measure-theoretic variant of a lemma going back to
Bourgain, Lindenstrauss and Milman [6, Lemma 12], with the main difference being that
the logarithmic dependence is improved upon to a power law. For two Borel measures $\mu$
and $\nu$ on a compact $K$ we say that $\mu \leq \nu$ if

$$ \int_K \varphi \, d\mu \leq \int_K \varphi \, d\nu \quad \text{for any continuous } \varphi : K \to [0, \infty). \tag{20} $$

Recall that a point is not in the support of a measure if and only if it has an open
neighborhood of measure zero. We abbreviate $\mathcal{O}_\ell = \mathcal{O}_{\ell, \ell^{-20}}$. For $v = (v_1, \ldots, v_\ell) \in \mathcal{O}_\ell$
denote

$$ \mu_v = \frac{1}{\ell} \sum_{i=1}^{\ell} \delta_{v_i}, $$

a Borel probability measure on $\mathbb{R}^n$ (in the notation of the previous section, $\mu_v = \mathcal{E}(v)$).
When $K \subseteq \mathbb{R}^n$, we write $\mathcal{O}_\ell(K) \subseteq \mathcal{O}_\ell$ for the collection of all $(v_1, \ldots, v_\ell) \in \mathcal{O}_\ell$ with
$v_i \in K$ for all $i$. Then $\mathcal{O}_\ell(K) = \mathcal{O}_\ell \cap K^\ell \subseteq (\mathbb{R}^n)^\ell$ and it is straightforward to verify
that $\mathcal{O}_\ell(K)$ is compact whenever $K \subset \mathbb{R}^n$ is a compact that does not contain the origin.

Lemma 4.3 Let $\ell, n$ be positive integers and let $0 < \varepsilon < 1/2$. Let $\mu$ be a Borel probability
measure on $\mathbb{R}^n$ with $\mu(\{0\}) = 0$. Assume that

sup / {x-e)^d'Jl,fi{x) < ^. (21) 

Then, there exists a Borel probability measure $\lambda$ on $\mathcal{O}_\ell$ such that

$$ d_{TV}\left( \mu, \int_{\mathcal{O}_\ell} \mu_v \, d\lambda(v) \right) \leq \varepsilon. \tag{22} $$


Proof: Since /i({0}) = then for any 5 > we may find a large punctured ball 
K = {x e M"; r < |a;| < R} with <r < R such that fi{K) > 1 - (5. We may assume 
that /i is supported on a compact set that does not contain the origin: Otherwise, replace 
fi with for a large punctured ball K C R" with fJ,{K) > 1 — (5 as above, and observe 
that dTvifJ', A* I is") ^ ^, so the effect of the replacement on the inequalities (ISTT i and ( |22] | 
is bounded by J, which can be made arbitrarily small. Write K C K" for the support of 
fi, a compact which does not contain the origin. Denote by T the collection of all Borel 
measures A supported on Oe{K) for which 

/ fivdX{v) < fi. (23) 
Joi 

The condition (23) defining $\mathcal{F}$ is closed in the weak* topology. Furthermore, $\lambda(O_\ell) \le 1$ for all $\lambda \in \mathcal{F}$ (use (23), and take $\varphi \equiv 1$ in the definition (20)). Hence $\mathcal{F}$ is a weak* closed subset of the unit ball of the Banach space of signed finite Borel measures on the compact $O_\ell(K)$. From the Banach-Alaoglu theorem, $\mathcal{F}$ is compact in the weak* topology. Therefore the continuous functional $\lambda \mapsto \lambda(O_\ell)$ attains its maximum on $\mathcal{F}$ at some $\lambda_0 \in \mathcal{F}$. Clearly $\lambda_0(O_\ell) \le 1$. To prove the lemma, it suffices to show that

$$\lambda_0(O_\ell) \ge 1 - \varepsilon. \tag{24}$$

Indeed, if (24) holds, then we may define a probability measure $\lambda_1 = \lambda_0 + \lambda$, where $\lambda$ is any Borel measure on $O_\ell$ of total mass $1 - \lambda_0(O_\ell)$. Then clearly

$$d_{TV}\left( \mu, \int_{O_\ell} \mu_v \, d\lambda_1(v) \right) \le \lambda(O_\ell) \le \varepsilon,$$

and the lemma follows. We thus focus on the proof of (24). Assume by contradiction that (24) fails. Denote $\nu = \mu - \int_{O_\ell} \mu_v \, d\lambda_0(v)$. Then $\nu$ is a non-negative Borel measure on $K \subset \mathbb{R}^n$, according to (23), and also $\nu \le \mu$. Moreover, $\nu(K) > \varepsilon$, since we assume that (24) fails. Denote $\bar{\nu} = \nu / \nu(K)$, a probability measure on $K \subset \mathbb{R}^n$. Then $\bar{\nu} \le \nu / \varepsilon$ and hence $\mathcal{R}_*(\bar{\nu}) \le \mathcal{R}_*(\nu) / \varepsilon \le \mathcal{R}_*(\mu) / \varepsilon$. For any unit vector $\theta \in S^{n-1}$,

{x-efdn^i){x) < e-^ I [x-efdn^v{x) < e^^ / {x-9fdn^fi{x) < 

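from our assumption (21). Lemma 4.2 thus asserts the existence of $x_1, \ldots, x_\ell \in S^{n-1}$ in the support of $\mathcal{R}_*(\bar{\nu})$ such that $(x_1, \ldots, x_\ell)$ is $(\ell^{-20}/2)$-orthogonal. Consequently, there exist non-zero vectors $x_1, \ldots, x_\ell \in \mathbb{R}^n$ in the support of $\nu$ such that $(x_1, \ldots, x_\ell)$ is $(\ell^{-20}/2)$-orthogonal. Let $U_1, \ldots, U_\ell \subset \mathbb{R}^n$ be small open neighborhoods of $x_1, \ldots, x_\ell$, respectively, such that

$$(y_1, \ldots, y_\ell) \in O_{\ell, \ell^{-20}} = O_\ell \qquad \text{for all } y_1 \in U_1, \ldots, y_\ell \in U_\ell.$$

The $U_i$'s are necessarily disjoint and $U_1 \times \ldots \times U_\ell \subset O_\ell$. Denote $\eta = \min_{i=1,\ldots,\ell} \nu(U_i)$. Then $\eta > 0$, since $U_i$ is an open neighborhood of the point $x_i$, and the point $x_i$ belongs to the support of $\nu$. We set $\nu_i = \nu|_{U_i}$, the conditioning of $\nu$ to $U_i$. Then $\nu_i$ is a probability measure supported on $K \subset \mathbb{R}^n$, and

$$\eta \nu_i \le \nu(U_i) \nu_i \le \nu = \mu - \int_{O_\ell} \mu_v \, d\lambda_0(v) \qquad \text{for } i = 1, \ldots, \ell.$$

Therefore, also

$$\eta \int_{U_1 \times \ldots \times U_\ell} \mu_y \, d\nu_1(y_1) \ldots d\nu_\ell(y_\ell) = \frac{\eta}{\ell} \sum_{i=1}^{\ell} \nu_i \le \mu - \int_{O_\ell} \mu_v \, d\lambda_0(v).$$

Consequently, the non-negative measure $\lambda = \lambda_0 + \eta (\nu_1 \times \ldots \times \nu_\ell)$ on $O_\ell(K)$ satisfies (23). Hence $\lambda \in \mathcal{F}$, but $\lambda(O_\ell) = \lambda_0(O_\ell) + \eta > \lambda_0(O_\ell)$, in contradiction to the maximality of $\lambda_0$. We thus conclude that (24) must be true, and the lemma is proven. □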

A $d \times n$ matrix $\Gamma$ will be called a "standard Gaussian random matrix" if the entries of $\Gamma$ are independent standard Gaussian random variables (of mean zero and variance one). Suppose that $w_1, \ldots, w_\ell$ are orthonormal vectors in $\mathbb{R}^n$ and that $\Gamma$ is a $d \times n$ standard Gaussian random matrix. Observe that in this case, $\Gamma(w_1), \ldots, \Gamma(w_\ell)$ are independent standard Gaussian random vectors in $\mathbb{R}^d$.
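The observation above can be checked empirically. The following is a minimal Monte Carlo sketch, not part of the original argument: the dimensions $d = 2$, $n = 4$, the particular orthonormal pair, the sample size and the seed are all illustrative assumptions. It estimates the variance of the first coordinate of $\Gamma(w_1)$ and the correlation between the first coordinates of $\Gamma(w_1)$ and $\Gamma(w_2)$, which should be close to one and zero respectively.

```python
import math
import random


def gaussian_matrix(d, n, rng):
    """Return a d x n matrix with independent N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(d)]


def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def empirical_moments(d=2, samples=20000, seed=0):
    """Estimate E[(Gamma w1)_1^2] and E[(Gamma w1)_1 (Gamma w2)_1]."""
    rng = random.Random(seed)
    s = 1.0 / math.sqrt(2.0)
    w1 = [s, s, 0.0, 0.0]   # an orthonormal pair in R^4 (illustrative choice)
    w2 = [s, -s, 0.0, 0.0]
    var = 0.0
    cross = 0.0
    for _ in range(samples):
        g = gaussian_matrix(d, 4, rng)
        y1, y2 = apply(g, w1), apply(g, w2)
        var += y1[0] ** 2 / samples
        cross += y1[0] * y2[0] / samples
    return var, cross


var, cross = empirical_moments()
print(var, cross)
```

Since the vectors $\Gamma(w_i)$ are jointly Gaussian, vanishing correlations are consistent with the independence claimed in the text; the simulation only illustrates this and proves nothing.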



Lemma 4.4 Let $d \le \ell \le n$ be positive integers, let $0 < \varepsilon < 1$ and assume that

$$\ell \ge (C/\varepsilon)^{Cd}. \tag{25}$$

Suppose that $\lambda$ is a Borel probability measure on $O_\ell$, and denote $\mu = \int_{O_\ell} \mu_v \, d\lambda(v)$. Let $\Gamma$ be a $d \times n$ standard Gaussian random matrix.

Then, with positive probability of selecting the random matrix $\Gamma$, the measure $\Gamma_*\mu$ on $\mathbb{R}^d$ is $\varepsilon$-radial. Here, $C > 1$ is a universal constant. (In fact, this probability is at least l~£-^.)

Proof: Fix $v = (v_1, \ldots, v_\ell) \in O_\ell$. Consider the measure $\tilde{\mu}_v := \Gamma_*(\mu_v)$ on $\mathbb{R}^d$. Then $\tilde{\mu}_v = \Gamma_*\left( \frac{1}{\ell} \sum_{i=1}^{\ell} \delta_{v_i} \right) = \frac{1}{\ell} \sum_{i=1}^{\ell} \delta_{\Gamma(v_i)}$. Let $E(v)$ be the following event:
• The measure $\tilde{\mu}_v$ is $C \ell^{-c/d}$-radial, where $C > 1$ and $0 < c < 1$ are the universal constants from Lemma 3.3.

Let us emphasize that for any $v \in O_\ell$, the event $E(v)$ might either hold or not, depending on the Gaussian random matrix $\Gamma$. We are going to apply Lemma 3.3. Since $v \in O_\ell = O_{\ell, \ell^{-20}}$, there exist orthonormal vectors $w_1, \ldots, w_\ell \in \mathbb{R}^n$ and numbers $(a_{ij})$ such
that $v_i = \sum_{j \le i} a_{ij} w_j$ and $|a_{ij}| \le \ell^{-20} |a_{ii}|$ for $j < i$, with $a_{ii} \neq 0$ for all $i$. Denote

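$X_i = \Gamma(w_i)$ and $Y_i = \sum_{j \le i} a_{ij} X_j$. Then $X_1, \ldots, X_\ell$ are independent standard Gaussian random vectors in $\mathbb{R}^d$, and $\tilde{\mu}_v = \frac{1}{\ell} \sum_{i=1}^{\ell} \delta_{Y_i}$. We may thus apply Lemma 3.3 (with $N = \ell$ and $\varepsilon = \ell^{-20}$) and conclude that for any $v \in O_\ell$,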

f{E{v)) > 1 - Ce^j>{-cf/^) - Cf(fr^° > 1 - C'e-^^. (26) 

Let $\mathcal{F} \subset O_\ell$ be the collection of all $v \in O_\ell$ for which the event $E(v)$ holds. Then $\mathcal{F}$ is a random subset of $O_\ell$ (depending on the random matrix $\Gamma$). According to (26),

= E [ l:F{v)d\{v) = [ Elyr{v)dX{v) = ( P{E{v))dX{v) > 1 - C'r^^ 

Joi Joi JOe 

where $1_{\mathcal{F}}$ is the characteristic function of $\mathcal{F}$. Therefore, by Chebyshev's inequality,

p (A(^) < 1 - 2c'r«) < ^^^^7^ < rV2 < 1. 

We may assume that $C' \ell^{-8} < 1/2$, thanks to (25). We showed that with positive probability $\lambda(\mathcal{F}) \ge 1 - 2C' \ell^{-8}$. Recall that $\tilde{\mu}_v = \Gamma_*(\mu_v)$ is $C \ell^{-c/d}$-radial for any $v \in \mathcal{F}$. Hence, according to Lemma 2.3(c), with positive probability of selecting the Gaussian random matrix $\Gamma$, the measure

$$\int_{O_\ell} \Gamma_*(\mu_v) \, d\lambda(v) = \Gamma_*\left( \int_{O_\ell} \mu_v \, d\lambda(v) \right) = \Gamma_*(\mu)$$

is C7-'='/'^-radial on R'^. □ 

The Grassmannian $G_{n,k}$ of all $k$-dimensional subspaces in $\mathbb{R}^n$ carries a unique rotationally invariant probability measure, which will be referred to as the uniform probability measure on $G_{n,k}$. When $\Gamma$ is a $d \times n$ standard Gaussian random matrix, the kernel of $\Gamma$ is a random $(n-d)$-dimensional subspace that is distributed uniformly in the Grassmannian $G_{n,n-d}$. For a subspace $E \subset \mathbb{R}^n$ we write $E^\perp = \{x \in \mathbb{R}^n ; \forall y \in E, \ x \cdot y = 0\}$ for its orthogonal complement.
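As a concrete companion to this paragraph, here is a small sketch that samples a Gaussian matrix and computes a basis of its kernel by Gauss-Jordan elimination. The helper `null_space`, the dimensions $d = 2$, $n = 5$ and the seed are assumptions made for illustration. The sketch only confirms that the kernel has dimension $n - d$ and is orthogonal to the rows of $\Gamma$, so that the row space is the orthogonal complement of the kernel; it does not test uniformity on the Grassmannian.

```python
import random


def null_space(matrix, tol=1e-12):
    """Basis of {x : matrix x = 0}, via Gauss-Jordan elimination with partial pivoting."""
    rows = [row[:] for row in matrix]
    d, n = len(rows), len(rows[0])
    pivots = []
    r = 0
    for c in range(n):
        if r == d:
            break
        p = max(range(r, d), key=lambda i: abs(rows[i][c]))
        if abs(rows[p][c]) < tol:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [a / piv for a in rows[r]]
        for i in range(d):
            if i != r:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for fc in free:
        x = [0.0] * n
        x[fc] = 1.0
        for i, pc in enumerate(pivots):
            x[pc] = -rows[i][fc]  # solve the i-th reduced equation for its pivot variable
        basis.append(x)
    return basis


rng = random.Random(1)
d, n = 2, 5
gamma = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(d)]
kernel = null_space(gamma)
# Largest inner product between a row of gamma and a kernel basis vector.
residual = max(abs(sum(a * x for a, x in zip(row, vec))) for row in gamma for vec in kernel)
print(len(kernel), residual)
```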

Lemma 4.5 Let $0 \le k \le n-1$ be integers, and let $\mu$ be a Borel probability measure on $\mathbb{R}^n$ with $\mu(\{0\}) = 0$. Suppose that $E$ is a random $k$-dimensional subspace, distributed uniformly in $G_{n,k}$. Then $\mu(E) = 0$ with probability one of selecting $E$.

Proof: By induction on $k$. The case $k = 0$ holds trivially. Suppose now that $k \ge 1$, let $n$ be such that $k \le n-1$, and let $\mu$ be a Borel probability measure on $\mathbb{R}^n$ with $\mu(\{0\}) = 0$. Since $\mu(\{0\}) = 0$, there are at most countably many one-dimensional subspaces $\ell \subset \mathbb{R}^n$ with $\mu(\ell) > 0$. Let $\ell$ be a random one-dimensional subspace, distributed uniformly in $G_{n,1}$. Then with probability one, $\mu(\ell) = 0$. Denote $\nu = (\mathrm{Proj}_{\ell^\perp})_*\mu$, a measure supported on an $(n-1)$-dimensional subspace, with $\nu(\{0\}) = 0$. Let $F$ be a random $(k-1)$-dimensional subspace in $\ell^\perp$, distributed uniformly. By the induction hypothesis, $\nu(F) = 0$ with probability one. Denoting $E = \mathrm{Proj}_{\ell^\perp}^{-1}(F)$, we see that $\mu(E) = \nu(F) = 0$ with probability one. From the uniqueness of the Haar measure, $E$ is distributed uniformly in $G_{n,k}$, and the lemma follows. □

Corollary 4.6 Let $1 \le d \le n$ be integers and let $0 < \varepsilon < 1/2$. Let $\mu$ be a Borel probability measure on $\mathbb{R}^n$ with $\mu(\{0\}) = 0$. Assume that

$$\sup_{\theta \in S^{n-1}} \int_{S^{n-1}} (x \cdot \theta)^2 \, d\mathcal{R}_*\mu(x) \le (c\varepsilon)^{Cd}. \tag{27}$$

Let $\Gamma$ be a $d \times n$ standard Gaussian random matrix. Then, with positive probability of selecting $\Gamma$, the measure $\Gamma_*\mu$ on $\mathbb{R}^d$ is $\varepsilon$-radial proper. Here, $0 < c < 1$ and $C > 1$ are universal constants. (In fact, we have a lower bound of 1 — (ce)'^''/^'^ for the aforementioned probability.)



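Proof: Throughout this proof, we write $\tilde{C}$ for the universal constant from Lemma 4.4. We define $c = (10\tilde{C})^{-1}$ and $C = 100\tilde{C}$. It is elementary to verify that with this choice of universal constants, there exists an integer $\ell$ such that

$$\ell \ge (5\tilde{C}/\varepsilon)^{\tilde{C}d} \qquad \text{and} \qquad (c\varepsilon)^{Cd} \le \frac{\varepsilon^2}{50 \ell^{50}}.$$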



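Note that the left-hand side of (27) is at least $1/n$. Indeed,

$$\sup_{\theta \in S^{n-1}} \int_{S^{n-1}} (x \cdot \theta)^2 \, d\mathcal{R}_*\mu(x) \ge \int_{S^{n-1}} \int_{S^{n-1}} (x \cdot \theta)^2 \, d\mathcal{R}_*\mu(x) \, d\sigma_{n-1}(\theta) = \int_{S^{n-1}} \frac{|x|^2}{n} \, d\mathcal{R}_*(\mu)(x) = \frac{1}{n}.$$

We conclude that $d \le \ell \le n$, and

$$\sup_{\theta \in S^{n-1}} \int_{S^{n-1}} (x \cdot \theta)^2 \, d\mathcal{R}_*\mu(x) \le \frac{\varepsilon^2}{50 \ell^{50}}.$$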

According to Lemma 4.3 there exists a Borel probability measure $\lambda$ on $O_\ell$ such that

$$d_{TV}\left( \mu, \int_{O_\ell} \mu_v \, d\lambda(v) \right) \le \varepsilon^2/25. \tag{28}$$

Denote $\nu = \int_{O_\ell} \mu_v \, d\lambda(v)$. From Lemma 4.4 the measure $\Gamma_*(\nu)$ is $(\varepsilon/5)$-radial, with probability at least l—£^^ of selecting $\Gamma$, because $\ell \ge (5\tilde{C}/\varepsilon)^{\tilde{C}d}$. Additionally, $d_{TV}(\Gamma_*\mu, \Gamma_*\nu) \le \varepsilon^2/25$, by (28). From Lemma 2.3(a) we thus learn that $\Gamma_*(\mu)$ is $\varepsilon$-radial, with positive probability of selecting $\Gamma$. Moreover, $\Gamma_*(\mu)(\{0\}) = \mu(\Gamma^{-1}(0)) = 0$ with probability one, according to Lemma 4.5. Hence, with positive probability, $\Gamma_*(\mu)$ is $\varepsilon$-radial proper.
□
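The averaging step in the proof above uses the identity $\int_{S^{n-1}} (x \cdot \theta)^2 \, d\sigma_{n-1}(\theta) = |x|^2/n$. The following Monte Carlo sketch illustrates it for one unit vector in $\mathbb{R}^5$; the test vector, the sample size and the seed are arbitrary illustrative choices, and uniform points on the sphere are generated by normalizing Gaussian vectors.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform point on S^{n-1}, obtained by normalizing a standard Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(t * t for t in g))
    return [t / norm for t in g]


def average_squared_projection(x, samples=40000, seed=0):
    """Monte Carlo estimate of the average of (x . theta)^2 over theta in S^{n-1}."""
    rng = random.Random(seed)
    n = len(x)
    total = 0.0
    for _ in range(samples):
        theta = random_unit_vector(n, rng)
        total += sum(a * b for a, b in zip(x, theta)) ** 2
    return total / samples


x = [0.6, 0.0, 0.8, 0.0, 0.0]  # a unit vector in R^5
estimate = average_squared_projection(x)
print(estimate, 1.0 / len(x))
```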
5 Selecting a position



Our goal in this section is to find an appropriate invertible linear transformation $T$ on $\mathbb{R}^n$ such that $T_*\mu$ satisfies the requirements of Corollary 4.6. Our analysis is very much related to the results of Barthe [2], Carlen and Cordero-Erausquin [7] and Carlen, Lieb and Loss [8]. For $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ we write $x \otimes x$ for the $n \times n$ matrix whose entries are $(x_i x_j)_{i,j=1,\ldots,n}$. For a probability measure $\mu$ on the unit sphere $S^{n-1}$ define

$$M(\mu) = \int_{S^{n-1}} (x \otimes x) \, d\mu(x).$$

Then $M(\mu)$ is a positive semi-definite matrix of trace one. Clearly, for any $\theta \in \mathbb{R}^n$, we have $M(\mu)\theta \cdot \theta = \int_{S^{n-1}} (x \cdot \theta)^2 \, d\mu(x)$. More generally, for any subspace $E \subset \mathbb{R}^n$,

$$\int_{S^{n-1}} |\mathrm{Proj}_E(x)|^2 \, d\mu(x) = \mathrm{Tr}(\mathrm{Proj}_E M(\mu)) = \mathrm{Tr}(M(\mu) \mathrm{Proj}_E),$$

the trace of the matrix $M(\mu)\mathrm{Proj}_E$. A Borel probability measure $\mu$ on $S^{n-1}$ is called isotropic if $M(\mu) = \mathrm{Id}/n$, where $\mathrm{Id}$ is the identity matrix. Observe that when $\mu$ is isotropic, for any subspace $E \subset \mathbb{R}^n$,

$$\int_{S^{n-1}} |\mathrm{Proj}_E x|^2 \, d\mu(x) = \frac{\dim(E)}{n}. \tag{29}$$

In particular, $\mu(E) \le \dim(E)/n$ and hence an isotropic probability measure is necessarily decent in the sense of Definition 1.1. A Borel probability measure $\mu$ on $\mathbb{R}^n$ with $\mu(\{0\}) = 0$ is called "potentially isotropic" if there exists an invertible linear map $T$ on $\mathbb{R}^n$ such that $(\mathcal{R} \circ T)_*\mu$ is isotropic.
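For a finitely supported measure on the sphere these objects are easy to compute by hand. The sketch below is an illustration only: the helper names and the two example measures are assumptions. It builds $M(\mu)$ for a weighted point set, checks that the trace is one, and compares $\mathrm{Tr}(\mathrm{Proj}_E M(\mu))$ with the direct integral for the coordinate subspace $E = \mathrm{span}(e_1, e_2)$. The uniform measure on the standard basis of $\mathbb{R}^3$ serves as an isotropic example.

```python
def second_moment_matrix(points, weights):
    """M(mu) = sum_i w_i x_i (tensor) x_i for a weighted point set on the unit sphere."""
    n = len(points[0])
    m = [[0.0] * n for _ in range(n)]
    for x, w in zip(points, weights):
        for i in range(n):
            for j in range(n):
                m[i][j] += w * x[i] * x[j]
    return m


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


# Isotropic example: uniform measure on the standard basis of R^3.
basis_points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
m_iso = second_moment_matrix(basis_points, [1.0 / 3] * 3)

# A non-isotropic example with unequal weights.
points = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
weights = [0.5, 0.25, 0.25]
m = second_moment_matrix(points, weights)

# E = span(e1, e2): Tr(Proj_E M) is the sum of the first two diagonal entries.
trace_side = m[0][0] + m[1][1]
integral_side = sum(w * (x[0] ** 2 + x[1] ** 2) for x, w in zip(points, weights))
print(m_iso, trace(m), trace_side, integral_side)
```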

Lemma 5.1 Let $\mu$ be a Borel probability measure on $S^{n-1}$ such that

$$\mu(H) = 0 \tag{30}$$

for any hyperplane $H \subset \mathbb{R}^n$ through the origin. Then $\mu$ is potentially isotropic.

Proof: Given an invertible linear map $T : \mathbb{R}^n \to \mathbb{R}^n$ we abbreviate $M_\mu(T) = M((\mathcal{R} \circ T)_*\mu)$. Then $M_\mu(T)$ is a positive semi-definite matrix of trace one, and by the arithmetic-geometric means inequality, $\det M_\mu(T) \le n^{-n}$. Note that $M_\mu(T) = M_\mu(\lambda T)$ for any $\lambda > 0$. Consider the supremum of the continuous functional

$$T \mapsto \det M_\mu(T) \tag{31}$$

over the space of all invertible linear operators $T : \mathbb{R}^n \to \mathbb{R}^n$ of Hilbert-Schmidt norm one.

We claim that the supremum is attained. Indeed, let $T_1, T_2, \ldots$ be a maximizing sequence of matrices. By passing to a subsequence if necessary, we may assume that $T_i \to T$, for a certain matrix $T$ of Hilbert-Schmidt norm one. We need to show that $T$ is invertible. Denote by $E$ the image of $T$, a subspace of $\mathbb{R}^n$. We need to show that $E = \mathbb{R}^n$. For any $x \in S^{n-1}$ which is not in the kernel of $T$, we have $T_i x \to Tx \in E \setminus \{0\}$, hence

$$|\mathrm{Proj}_{E^\perp} (\mathcal{R} \circ T_i)(x)| \xrightarrow{i \to \infty} 0. \tag{32}$$

The kernel of $T$ is at most $(n-1)$-dimensional, since the Hilbert-Schmidt norm of $T$ is one. According to (30), the convergence in (32) occurs $\mu$-almost-everywhere in $x$. Therefore,

$$\mathrm{Tr}(M_\mu(T_i) \mathrm{Proj}_{E^\perp}) = \int_{S^{n-1}} |\mathrm{Proj}_{E^\perp} (\mathcal{R} \circ T_i)(x)|^2 \, d\mu(x) \xrightarrow{i \to \infty} 0.$$

We conclude that if $E \neq \mathbb{R}^n$, then $\det M_\mu(T_i) \to 0$, in contradiction to the maximizing property of the sequence $(T_i)_{i \ge 1}$. Hence $E = \mathbb{R}^n$ and $T$ is invertible. Thus the supremum of the functional (31) is attained for some invertible matrix $T_0$ of Hilbert-Schmidt norm one. We will show that $(\mathcal{R} \circ T_0)_*\mu$ is isotropic. Without loss of generality we assume that $T_0 = \mathrm{Id}$ (otherwise, replace $\mu$ with $(\mathcal{R} \circ T_0)_*\mu$ and note that this replacement does not affect the validity of the assumptions and the conclusions of the lemma).

The matrix $M(\mu) = M_\mu(\mathrm{Id})$ is a positive semi-definite matrix of trace one. It is non-singular, thanks to (30), and therefore $M(\mu)$ is in fact positive definite. Moreover, for any function $u : S^{n-1} \to \mathbb{R}$ which is positive $\mu$-almost-everywhere and for any $\theta \in S^{n-1}$,

$$\int_{S^{n-1}} u(x) (x \cdot \theta)^2 \, d\mu(x) > 0. \tag{33}$$

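Assume by contradiction that $M(\mu)$ is not a scalar matrix. Denote by $\lambda_1$ the largest eigenvalue of $M(\mu) = M_\mu(\mathrm{Id})$, and let $E \subset \mathbb{R}^n$ be the eigenspace corresponding to the eigenvalue $\lambda_1$. Then $1 \le \dim(E) \le n-1$. For $0 < \delta < 1$ consider the linear operator

$$L_\delta(x) = x - \delta \, \mathrm{Proj}_E(x) \qquad (x \in \mathbb{R}^n).$$

Then $\mathrm{Proj}_E L_\delta = (1-\delta)\mathrm{Proj}_E$ while $\mathrm{Proj}_{E^\perp} L_\delta = \mathrm{Proj}_{E^\perp}$. This means that $\mathcal{R} \circ L_\delta$ strengthens the $E^\perp$-component of a given point in $\mathbb{R}^n$, at the expense of its $E$-component. More precisely, for any $x \in S^{n-1}$ and $0 < \delta < 1$ there exists $\varepsilon_\delta \ge 0$ such that

$$\mathrm{Proj}_{E^\perp}(\mathcal{R}(L_\delta x)) = (1 + \varepsilon_\delta) \mathrm{Proj}_{E^\perp}(x).$$

Moreover, when $x \notin E \cup E^\perp$ we have the inequality $\varepsilon_\delta \ge \varepsilon(x)\delta$ for some $\varepsilon(x) > 0$ depending only on $x$. Consequently, for any $0 < \delta < 1$ and a non-zero vector $\theta \in E^\perp$,

$$\int_{S^{n-1}} (\mathcal{R}(L_\delta x) \cdot \theta)^2 \, d\mu(x) = \int_{S^{n-1}} (1 + \varepsilon_\delta)^2 (x \cdot \theta)^2 \, d\mu(x) \ge \int_{S^{n-1}} (x \cdot \theta)^2 \, d\mu(x) + 2\delta \int_{S^{n-1}} \varepsilon(x) (x \cdot \theta)^2 \, d\mu(x). \tag{34}$$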

The symmetric matrix $M_\mu(L_\delta)$ is of trace one, and it depends smoothly on $\delta$. Denote $D = dM_\mu(L_\delta)/d\delta \,|_{\delta=0}$, a traceless symmetric matrix. According to our assumption (30), the condition $x \notin E \cup E^\perp$ holds $\mu$-almost-everywhere as $1 \le \dim(E) \le n-1$. Therefore $\varepsilon(x) > 0$ for $\mu$-almost every $x \in S^{n-1}$. From (33) and (34) we learn that for any $0 \neq \theta \in E^\perp$,

$$D\theta \cdot \theta = \left. \frac{d \, M_\mu(L_\delta)\theta \cdot \theta}{d\delta} \right|_{\delta=0} > 0. \tag{35}$$

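Recall that $E$ is the eigenspace corresponding to the maximal eigenvalue $\lambda_1$ of $M(\mu)$. Denote by $\lambda_2$ the second-largest eigenvalue, which is still positive but is strictly smaller than $\lambda_1$. Then $\mathrm{Proj}_{E^\perp} M(\mu)^{-1} \ge \lambda_2^{-1} \mathrm{Proj}_{E^\perp}$ in the sense of symmetric matrices. Using elementary linear algebra, we deduce from (35) that

$$\mathrm{Tr}(\mathrm{Proj}_{E^\perp} M(\mu)^{-1} D) \ge \lambda_2^{-1} \mathrm{Tr}(\mathrm{Proj}_{E^\perp} D) > 0.$$

Since $\mathrm{Tr}(D) = 0$ then $\mathrm{Tr}(\mathrm{Proj}_E D) = -\mathrm{Tr}(\mathrm{Proj}_{E^\perp} D)$ and

$$\left. \frac{d \log \det M_\mu(L_\delta)}{d\delta} \right|_{\delta=0} = \mathrm{Tr}(M(\mu)^{-1} D) = \mathrm{Tr}(\mathrm{Proj}_E M(\mu)^{-1} D) + \mathrm{Tr}(\mathrm{Proj}_{E^\perp} M(\mu)^{-1} D)$$

$$= \frac{\mathrm{Tr}(\mathrm{Proj}_E D)}{\lambda_1} + \mathrm{Tr}(\mathrm{Proj}_{E^\perp} M(\mu)^{-1} D) \ge \left( \frac{1}{\lambda_2} - \frac{1}{\lambda_1} \right) \mathrm{Tr}(\mathrm{Proj}_{E^\perp} D) > 0,$$

in contradiction to the maximality of $\det M(\mu)$. Hence our assumption that $M(\mu)$ is not a scalar matrix was absurd. Since $\mathrm{Tr}(M(\mu)) = 1$ then $M(\mu) = \mathrm{Id}/n$ and $\mu$ is isotropic.
□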
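The proof above is variational and non-constructive. For a finitely supported measure in the plane, a map $T$ putting the measure in isotropic position can be approximated numerically by a different technique: a fixed-point iteration of Tyler type, which is not the argument of this paper. The sketch below is written under that assumption. The point set (seven directions stretched by a factor of five along the second axis), the iteration count and all helper names are illustrative. The iteration updates a matrix $S$ by $S \leftarrow c \sum_i x_i \otimes x_i / (x_i \cdot S^{-1} x_i)$, with $c$ normalizing the trace, and returns $T$ with $T^t T = S^{-1}$.

```python
import math


def inverse_2x2(s):
    (a, b), (c, d) = s
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def moment_matrix(points, t):
    """M((R o T)_* mu) for mu uniform on `points` and a 2x2 matrix t."""
    m = len(points)
    m11 = m12 = m22 = 0.0
    for x, y in points:
        u = t[0][0] * x + t[0][1] * y
        v = t[1][0] * x + t[1][1] * y
        r2 = u * u + v * v
        m11 += u * u / (r2 * m)
        m12 += u * v / (r2 * m)
        m22 += v * v / (r2 * m)
    return ((m11, m12), (m12, m22))


def isotropic_position(points, iterations=200):
    """Tyler-type fixed-point iteration; returns T with T^t T = S^{-1}."""
    s = ((0.5, 0.0), (0.0, 0.5))
    for _ in range(iterations):
        p = inverse_2x2(s)
        a = b = d = 0.0
        for x, y in points:
            q = p[0][0] * x * x + 2.0 * p[0][1] * x * y + p[1][1] * y * y
            a += x * x / q
            b += x * y / q
            d += y * y / q
        tr = a + d  # normalize to trace one
        s = ((a / tr, b / tr), (b / tr, d / tr))
    p = inverse_2x2(s)
    # Cholesky factorization p = L L^t; T = L^t satisfies T^t T = p.
    l11 = math.sqrt(p[0][0])
    l21 = p[1][0] / l11
    l22 = math.sqrt(p[1][1] - l21 * l21)
    return ((l11, l21), (0.0, l22))


points = [(math.cos(k * math.pi / 7), 5.0 * math.sin(k * math.pi / 7)) for k in range(7)]
identity = ((1.0, 0.0), (0.0, 1.0))
before = moment_matrix(points, identity)
t = isotropic_position(points)
after = moment_matrix(points, t)
print(before, after)
```

In this example no line through the origin carries half of the mass, which is the discrete analogue of the non-degeneracy condition under which such a position exists.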

For a subspace $E \subset \mathbb{R}^n$ and $\delta > 0$ we write

$$\mathcal{N}_\delta(E) = \{ rx \, ; \, |x| = 1, \ r \ge 0, \ d(x, E \cap S^{n-1}) \le \delta \} \tag{36}$$

where $d(x, A) = \inf_{y \in A} |x - y|$. Then $\mathcal{N}_\delta(E)$ is the projective $\delta$-neighborhood of $E$. We will need the following auxiliary continuity lemma. It is the only time in this text where the non-degeneracy conditions of Definition 1.1 are used.
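Membership in the projective neighborhood is simple to test. A minimal sketch follows, assuming that $E$ has dimension at least one and is given by an orthonormal basis, and assuming the non-strict inequality $d \le \delta$. For a unit vector $x$, the nearest point of $E \cap S^{n-1}$ is the normalized projection of $x$ onto $E$, which gives $d(x, E \cap S^{n-1}) = \sqrt{2 - 2|\mathrm{Proj}_E x|}$. The function names are hypothetical.

```python
import math


def distance_to_sphere_section(x, basis):
    """d(x/|x|, E intersect S^{n-1}) for nonzero x, with E spanned by the orthonormal list `basis`."""
    norm = math.sqrt(sum(t * t for t in x))
    u = [t / norm for t in x]
    coords = [sum(a * b for a, b in zip(u, e)) for e in basis]
    proj_norm = math.sqrt(sum(c * c for c in coords))
    return math.sqrt(max(0.0, 2.0 - 2.0 * proj_norm))


def in_projective_neighborhood(x, basis, delta):
    """Test whether a nonzero x lies in the projective delta-neighborhood of E."""
    return distance_to_sphere_section(x, basis) <= delta


e_basis = [[1.0, 0.0, 0.0]]  # E = span(e1) in R^3
print(distance_to_sphere_section([1.0, 1.0, 0.0], e_basis))
print(in_projective_neighborhood([10.0, 1.0, 0.0], e_basis, 0.2))
```

The test depends only on the direction of $x$, in line with the scale invariance built into (36).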

Lemma 5.2 Let $n \ge 1$ be an integer and let $\mu$ be a probability measure on $\mathbb{R}^n$ with $\mu(\{0\}) = 0$ such that

$$\mu(E) < \dim(E)/n$$

for any subspace $E \subset \mathbb{R}^n$ other than $\mathbb{R}^n$ and $\{0\}$. Suppose there exists a sequence of potentially isotropic probability measures on $S^{n-1}$ that converges to $\mathcal{R}_*\mu$ in the weak* topology. Then $\mu$ is potentially isotropic.

Proof: From the assumptions of the lemma, there exist Borel probability measures $\mu_1, \mu_2, \ldots$ on $S^{n-1}$ and invertible linear maps $T_1, T_2, \ldots$ for which the following holds:

• $\mu_i \to \mathcal{R}_*\mu$ in the weak* topology; and

• $(\mathcal{R} \circ T_i)_*\mu_i$ is isotropic for all $i$.

Without loss of generality we may assume that the $T_i$'s are positive definite operators of trace one: If not, we will replace the operator $T_i$ by $r_i U_i T_i$, where $U_i$ is an orthogonal transformation such that $U_i T_i$ is positive definite and $r_i^{-1}$ is the trace of $U_i T_i$. Such a replacement does not affect the isotropicity of $(\mathcal{R} \circ T_i)_*\mu_i$. Furthermore, replacing $T_i$ $(i = 1, 2, \ldots)$ with a subsequence, we may assume that $T_i \to T$, where $T$ is a positive semi-definite matrix of trace one.

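We claim that $T$ is invertible. Assume by contradiction that $T$ is singular. Denote by $E \subset \mathbb{R}^n$ the kernel of $T$, and set $k = \dim(E)$. Then $1 \le k \le n-1$ and hence $\mu(E) < k/n$. Since $E = \bigcap_{\delta > 0} \mathcal{N}_\delta(E)$, there exists $\delta > 0$ such that

$$\mu(\mathcal{N}_\delta(E)) < k/n.$$

The set $\mathcal{N}_\delta(E) \cap S^{n-1}$ is closed in $S^{n-1}$. Since $\mu_i \to \mathcal{R}_*\mu$ in the weak* topology then

$$\limsup_{i \to \infty} \mu_i(\mathcal{N}_\delta(E)) \le \mathcal{R}_*\mu(\mathcal{N}_\delta(E)) = \mu(\mathcal{N}_\delta(E)) < \frac{k}{n}. \tag{37}$$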

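Recall that $T_i \to T$, that the $T_i$'s are self-adjoint, and that $E$ is the kernel of $T$, hence $E^\perp$ is the image of $T$. This entails, roughly speaking, that for any $x \notin E$, the sequence $T_i x$ is "approaching $E^\perp$". In more precise terms, we conclude that for any $x \notin \mathcal{N}_\delta(E)$,

$$\left| \mathrm{Proj}_{E^\perp} \left( \frac{T_i x}{|T_i x|} \right) \right| \xrightarrow{i \to \infty} 1. \tag{38}$$

Moreover, the convergence in (38) is uniform over $x \in \mathbb{R}^n \setminus \mathcal{N}_\delta(E)$. Consequently, from (37) and (38),

$$\liminf_{i \to \infty} \int_{S^{n-1}} \left| \mathrm{Proj}_{E^\perp} \left( \frac{T_i x}{|T_i x|} \right) \right|^2 d\mu_i(x) \ge \liminf_{i \to \infty} \mu_i(S^{n-1} \setminus \mathcal{N}_\delta(E)) > 1 - \frac{k}{n}. \tag{39}$$

Recall that $(\mathcal{R} \circ T_i)_*\mu_i$ is isotropic. According to (29),

$$\int_{S^{n-1}} \left| \mathrm{Proj}_{E^\perp} \left( \frac{T_i x}{|T_i x|} \right) \right|^2 d\mu_i(x) = \frac{\dim(E^\perp)}{n} = 1 - \frac{k}{n},$$

in contradiction to (39). Thus our assumption that $T$ is singular was absurd, and $T$ is necessarily invertible.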

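Since $T_i \to T$ with $T$ being invertible, we know that for any $x \in S^{n-1}$,

$$\frac{T_i x}{|T_i x|} \xrightarrow{i \to \infty} \frac{Tx}{|Tx|}$$

and the convergence is uniform in $S^{n-1}$. Therefore, for any $\theta \in S^{n-1}$,

$$\int_{S^{n-1}} \left( \frac{T_i x}{|T_i x|} \cdot \theta \right)^2 d\mu_i(x) \xrightarrow{i \to \infty} \int_{S^{n-1}} \left( \frac{Tx}{|Tx|} \cdot \theta \right)^2 d\mathcal{R}_*\mu(x). \tag{40}$$

However, the left-hand side of (40) is always $1/n$. We see that $(\mathcal{R} \circ T \circ \mathcal{R})_*\mu = (\mathcal{R} \circ T)_*\mu$ is isotropic, and therefore $\mu$ is potentially isotropic. □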
Corollary 5.3 Let $n$ be a positive integer and let $\mu$ be a Borel probability measure on $\mathbb{R}^n$ with $\mu(\{0\}) = 0$ such that

$$\mu(E) < \dim(E)/n \tag{41}$$

for any subspace $E \subset \mathbb{R}^n$ other than $\mathbb{R}^n$ and $\{0\}$. Then $\mu$ is potentially isotropic.

Proof: Consider a sequence $\mu_1, \mu_2, \ldots$ of Borel probability measures on $S^{n-1}$, absolutely continuous with respect to the Lebesgue measure on $S^{n-1}$, that converges to $\mathcal{R}_*\mu$ in the weak* topology. The $\mu_i$'s are potentially isotropic by Lemma 5.1. Therefore $\mu$ is potentially isotropic according to Lemma 5.2. □

A clever proof of Corollary 5.3 for the case where the measure $\mu$ is discrete and has finite support appears in the works of Barthe [2], Carlen and Cordero-Erausquin [7, Lemma 3.5] and Carlen, Lieb and Loss [8]. We were not able to generalize their argument to the case of a general measure satisfying (41). The proof presented above is unfortunately longer, but perhaps it has the advantage of being geometrically straightforward.

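Lemma 5.4 Let $n$ be a positive integer and $\alpha > 0$. Suppose that $\mu$ is an $\alpha$-decent probability measure on $\mathbb{R}^n$. Then for any $0 < \varepsilon < 1$, there exists a linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ such that $\nu = T_*(\mu)$ satisfies

$$\int_{S^{n-1}} (x \cdot \theta)^2 \, d\mathcal{R}_*\nu(x) \le \alpha + \varepsilon \qquad \text{for all } \theta \in S^{n-1}.$$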



Proof: By induction on the dimension $n$. The case $n = 1$ is obvious. Suppose that $n \ge 2$. We may assume that $\mu(H) < 1$ for any hyperplane $H \subset \mathbb{R}^n$ that passes through the origin (otherwise, invoke the induction hypothesis). We may also assume that $\alpha = \sup_{E \subset \mathbb{R}^n} \mu(E)/\dim(E)$ where the supremum runs over all subspaces $\{0\} \neq E \subset \mathbb{R}^n$. Corollary 5.3 takes care of the case where

$$\mu(E) < \dim(E)/n$$

for any subspace $E \subset \mathbb{R}^n$ with $1 \le \dim(E) \le n-1$. We may thus focus on the case where there exists a proper subspace $E \subset \mathbb{R}^n$ with $\mu(E) \ge \dim(E)/n$. Clearly $\alpha \ge 1/n$. Consequently, there is a subspace $E \subset \mathbb{R}^n$, with $1 \le \dim(E) \le n-1$, such that $(\alpha - \varepsilon/(3n)) \dim(E) \le \mu(E) < 1$. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be the map defined by

$$T(x) = \begin{cases} \mathrm{Proj}_{E^\perp} x & x \notin E \\ x & x \in E \end{cases}$$

The map $T$ may be viewed as a "stratified linear map" as in Furstenberg [9]. Set $\lambda = \mu(E) > 0$. The probability measure $T_*\mu$ on $\mathbb{R}^n$ is supported on $E \cup E^\perp$, and it may be decomposed as

$$T_*\mu = \lambda \mu_E + (1 - \lambda) \mu_{E^\perp}$$

where $\mu_E = \mu|_E$ is the conditioning of $\mu$ to $E$, and $\mu_{E^\perp}$ is a certain probability measure supported on $E^\perp$. Clearly, $\mu_E = \mu|_E$ is $(\alpha/\lambda)$-decent. Regarding $\mu_{E^\perp}$, let us select a subspace $F \subset E^\perp$. Then, $\mu_{E^\perp}(F) = 0$ if $F = \{0\}$ and otherwise

$$(1 - \lambda)\mu_{E^\perp}(F) = \mu(T^{-1}(F \setminus \{0\})) = \mu((F \oplus E) \setminus E) = \mu(F \oplus E) - \mu(E)$$
$$\le \alpha(\dim(E) + \dim(F)) - (\alpha - \varepsilon/(3n)) \dim(E) \le (\alpha + \varepsilon/3) \dim(F)$$

where $E \oplus F = \{x + y \, ; \, x \in E, y \in F\}$ is the subspace spanned by $E$ and $F$. Consequently, $\mu_{E^\perp}$ is an $((\alpha + \varepsilon/3)/(1 - \lambda))$-decent measure on $E^\perp$. We may apply the induction hypothesis for $\mu_E$ and $\mu_{E^\perp}$. We conclude that there exists a linear transformation $S : \mathbb{R}^n \to \mathbb{R}^n$, with $S(E) \subset E$ and $S(E^\perp) \subset E^\perp$, such that

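$$\int_{S^{n-1}} (x \cdot \theta)^2 \, d(\mathcal{R} \circ S \circ T)_*\mu(x) \le \alpha + \varepsilon/2 \qquad \text{for any } \theta \in S^{n-1}. \tag{42}$$

The problem is that $S \circ T$ is not a linear map. However, it is easy to approximate it by a linear map: For $0 < \delta < 1$ denote $T_\delta x = x - \delta \, \mathrm{Proj}_E x$. Then $(\mathcal{R} \circ T)(x) = \lim_{\delta \to 1^-} (\mathcal{R} \circ T_\delta)(x)$ for any $0 \neq x \in \mathbb{R}^n$. Consequently,

$$(\mathcal{R} \circ S \circ T_\delta)_*\mu = (\mathcal{R} \circ S \circ \mathcal{R} \circ T_\delta)_*\mu \xrightarrow{\delta \to 1^-} (\mathcal{R} \circ S \circ \mathcal{R} \circ T)_*\mu = (\mathcal{R} \circ S \circ T)_*\mu$$

in the weak* topology. We conclude that the matrices $M((\mathcal{R} \circ S \circ T_\delta)_*\mu)$ tend to $M((\mathcal{R} \circ S \circ T)_*\mu)$ as $\delta \to 1^-$. Hence, by (42), for some $\delta_0 < 1$,

$$\int_{S^{n-1}} (x \cdot \theta)^2 \, d(\mathcal{R} \circ S \circ T_{\delta_0})_*\mu(x) \le \alpha + \varepsilon \qquad \text{for any } \theta \in S^{n-1}.$$

The map $S \circ T_{\delta_0}$ is the desired linear transformation. This completes the proof. □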
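The approximation step at the end of the proof can be visualized in the plane. The sketch below is a toy illustration under the assumption that $E = \mathrm{span}(e_1) \subset \mathbb{R}^2$; the function names are hypothetical. It compares $\mathcal{R} \circ T_\delta$ for $\delta$ close to one with $\mathcal{R} \circ T$, where $T$ is the stratified map from the proof.

```python
import math


def radial(x):
    """R(x) = x / |x| for a nonzero x in R^2."""
    r = math.hypot(x[0], x[1])
    return (x[0] / r, x[1] / r)


def t_delta(x, delta):
    """T_delta x = x - delta * Proj_E x, with E = span(e1)."""
    return ((1.0 - delta) * x[0], x[1])


def t_stratified(x):
    """The stratified map T: identity on E, orthogonal projection onto E-perp off E."""
    return x if x[1] == 0.0 else (0.0, x[1])


delta = 1.0 - 1e-9
off_e = (3.0, 4.0)    # a point outside E
on_e = (-2.0, 0.0)    # a point inside E
print(radial(t_delta(off_e, delta)), radial(t_stratified(off_e)))
print(radial(t_delta(on_e, delta)), radial(t_stratified(on_e)))
```

For points outside $E$ the limit direction is that of the $E^\perp$-component, while points of $E$ keep their direction for every $\delta < 1$; both cases agree with $\mathcal{R} \circ T$.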



6 Proof of the main theorem and some remarks 



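Proof of Theorem 1.3: Suppose that $\mu$ is an $\eta$-decent probability measure on $\mathbb{R}^n$. According to Lemma 5.4, there exists a linear map $S : \mathbb{R}^n \to \mathbb{R}^n$ such that $\nu = S_*\mu$ satisfies

$$\int_{S^{n-1}} (x \cdot \theta)^2 \, d\mathcal{R}_*\nu(x) \le 2\eta \qquad \text{for all } \theta \in S^{n-1}.$$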



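We invoke Corollary 4.6 for the measure $\nu$. We see that if the positive integer $d$ and $0 < \varepsilon < 1/2$ are such that

$$2\eta \le (c\varepsilon)^{Cd},$$

with $c$ and $C$ the constants of Corollary 4.6, then there exists a $d \times n$ matrix $\Gamma$ for which the measure $\Gamma_*\nu$ on $\mathbb{R}^d$ is $\varepsilon$-radial proper. Setting $T = \Gamma S$, a $d \times n$ matrix, we conclude that $T_*\mu = \Gamma_*\nu$ is a measure on $\mathbb{R}^d$ which is $\varepsilon$-radial proper. □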



Proof of Corollary \1.4\ We may assume that n exceeds a given large universal con- 
stant. Denote d = cfVTogrTI and 5 = for a small universal constant < c < 1 
such that n > {C/S)'~^'^ where C is the universal constant from Theorem ll.3l According 
to Theorem ll.3l we may pass to a d-dimensional marginal and assume that our measure 
/i is a proper (5-radial measure on R"^. For < e R and L > let xt.L be the L-Lipschitz 
function on the real line which equals zero on (— oo, t] and one on [t + 1/L, oo). Recall 
the Kantorovich-Rubinstein duality as in (|5]l above. Then, for any probability measure v 
on the unit sphere S"*"^ and < i < c < 1/2, 

v{{x; xi > t}) > / XtA^i)d'^i^) ^ / Xt.d{^i)dcrd-i{x) ~ dWi{iy,aa_i), 
where $x = (x_1, \ldots, x_d)$ are the coordinates of $x \in S^{d-1}$. The integral with respect to
(Trf_i may be estimated directly, and it is bounded from below by ce^*-^* (note that the 
marginal of Ud-i on the first coordinate has a density that is proportional to (1— t^)^^'^'^^ 
on [—1, 1]). We conclude that for any < i < c and an interval J = [a, b] C (0, oo) with 

t^[S[J)) > S, 

fJ-\s{J){{x;xi > at}) > n4n\s(j)){{x;xi > t}) 

> ce-^''''~d-Wiin,{fi\sij)),(Jd-i)>ce-^''''-dS>c'e-^''"''. (43) 

Similarly, for any interval J = [a, b] C (0, oo) with J)) > S, 

mIs(j) {{x; \xi\ > 206/%/rf}) < 7^,(M|s(,/)) ({a^; l^^il > 20/Vrf}) 

< / X2o/Vd-d-^d(\^^\)d'^d-lix)+d■W,in4^i\siJ)),c7d-l)<l/5, (44) 

where the integral with respect to ad-i is estimated in a straightforward manner. Let 
M > be a quantile with 

fi{{x; \x\ < M}) > 3/4 and \x\ > M}) > 1/4. 

Let a > be such that the interval J [a, M] satisfies fi{S{ J)) > 2/3. We apply ( |44] | 
for the interval J = [a, M] to deduce that 

>20Af/Vd}) < l + ^■^i\siJ) [{x;\xi\>20M/Vd]) < i + ^-i < i. 

(45) 

Suppose that M > satisfies (|3]l with the linear functional ip{x) — xi. We learn from 
(|43] | that necessarily M < 20A7/\/d. Let 6 > be such that the interval J = [M, b] 
satisfies fi{S{J)) > 1/5. We apply (|43] | for the interval J = [M, b] and conclude that, 

for any < t < cVd/ 20, 

H {{x; XI > tM}) > i • ^ls(j) ({x; xi > 20tM/Vdj^ > ^ exp (-4006"^^) . 

Since c\/d/20 > clog^^'^ n, the proof of the lower bound for /i({a;; xi > tM}) is com- 
plete. The proof of the lower bound for fJ.{{x; xi < —tM}) is almost entirely identical. 
The corollary is thus proven. □ 

Remarks. 

1. It is conceivable that a more delicate analysis yields a better bound for $R_n$ in Corollary 1.4. However, note that $R_n \le C\sqrt{\log n}$ as is shown by the example where $\mu$ is distributed uniformly on $n$ linearly independent vectors in $\mathbb{R}^n$. Compare the "super-gaussian" tail behavior of Corollary 1.4 with the almost sub-gaussian bounds in the convex case in [13] and in Giannopoulos, Paouris and Pajor [10].

2. The central limit theorem for convex bodies [14, 15] states that any uniform probability measure on a high-dimensional convex set has some low-dimensional marginals that are approximately Gaussian. It is clear that there are perfectly regular probability measures in high dimension (e.g., a mixture of two Gaussians) without any approximately Gaussian marginals. Therefore, a geometric condition such as convexity is indeed relevant when we look for approximately Gaussian marginals. For arbitrary high-dimensional measures without convexity properties, we may still state the more modest conclusion that some of the marginals are approximately spherically-symmetric, according to Theorem 1.3. There is no hope for approximate Gaussians.

Theorem 1.3 bears a strong relation to the proof of the central limit theorem for convex bodies presented in [14, 15] (see [16] for another proof, which at present works only for a subclass of convex bodies). That proof begins by showing that marginals of the uniform measure on a convex body are approximately spherically-symmetric. The approximation in [14, 15] is rather strong compared to Theorem 1.3 but nevertheless, a simple compactness argument enables us to leverage Theorem 1.3 in order to obtain the desired type of approximation. In principle, this approach yields a slightly different proof of the central limit theorem for convex sets, albeit with weaker estimates.

The Euclidean structure with respect to which a random projection "works" with high probability seems a priori different in Theorem 1.3 and in the central limit theorem for convex bodies. In Theorem 1.3 we use the Euclidean structure with respect to which the covariance matrix of $\mathcal{R}_*\mu$ is scalar, while in the central limit theorem for convex bodies, the most natural position is to require the covariance matrix of $\mu$ itself to be a scalar matrix (compare also with [15], [21]). For convex bodies, these Euclidean structures are close to each other, since most of the mass of a normalized convex body is located very close to a sphere (see [17]).

3. The linear map $T$ in Theorem 1.3 may be assumed to be an orthogonal projection. This follows from the following simple observation we learned from G. Schechtman: Any $n$-dimensional ellipsoid has an $[n/2]$-dimensional projection which is precisely a Euclidean ball. Therefore, in order to show that $T$ may be chosen to be an orthogonal projection, one essentially has to verify that a $[d/2]$-dimensional
marginal of an e-radial measure on M'' is lOOe^/^-radial. We omit the details. 

4. The isoperimetric inequality on the high-dimensional sphere, which is the cornerstone of the concentration of measure phenomenon (see Milman and Schechtman [24]), is not used in the proof of Theorem 1.3. We do apply Lévy's lemma, which
embeds the isoperimetric inequality, in the proof of Lemma |Z41 but only in d di- 
mensions. The dimension d here is typically not very large. 

5. For a positive integer $d$ and $\varepsilon > 0$ denote by $N_0(\varepsilon, d)$ the minimal dimension with the following property: Whenever $N \ge N_0(\varepsilon, d)$, any $N$-dimensional Banach space has a $d$-dimensional subspace which is $\varepsilon$-close to a Hilbert space. The classical Dvoretzky's theorem states that $N_0(\varepsilon, d) \le \exp(Cd/\varepsilon^2)$, where $C > 0$ is a universal constant (see Milman [23] and references therein). The power of $1/\varepsilon$ in the exponent in the bound for $N_0(\varepsilon, d)$ can be made arbitrarily close to one at the expense of increasing the universal constant $C$ (see Schechtman [27]). It is conceivable, however, that these bounds are still far from optimal; perhaps $N_0(\varepsilon, d)$ can be made as small as $(C/\varepsilon)^{Cd}$? See Milman [22] for a discussion of this conjecture. An affirmative answer for the case $d = 2$ was given by Gromov [22], using a topological argument which does not seem to generalize to higher dimensions.

The analogy with the present article suggests trying to use Theorem 1.3, or ideas from its proof, in order to improve the bounds in Dvoretzky's theorem. Furthermore, the operation of marginal is dual, via the Fourier transform, to the operation of restriction to a subspace. So, for instance, suppose a norm $\|\cdot\|$ in $\mathbb{R}^n$ may be represented as



for a compactly-supported probability measure $\mu$ on $\mathbb{R}^n$. In this case, we may consider subspaces $E \subset \mathbb{R}^n$ for which $(\mathrm{Proj}_E)_*\mu$ is $\varepsilon$-radial, and expect that the restriction of $\|\cdot\|$ to these subspaces is close, in a certain sense, to the Euclidean norm. See Koldobsky [19, Chapter 6] for a comprehensive discussion of norms admitting representations in the spirit of (46).

While this approach may possibly yield some meaningful estimates for some classes of normed spaces, it has limitations. Theorem 1.3 is proven by considering a random marginal with respect to an appropriate Euclidean structure, i.e., a projection of the given measure to a subspace which is distributed uniformly over the Grassmannian of all $d$-dimensional subspaces in $\mathbb{R}^n$. However, for Banach spaces such as $\ell_\infty^n$, a random subspace is not sufficiently close to a Hilbert space (see Schechtman [28]), and there are better choices than the random one. (Indeed, the $\ell_\infty^n$ norm cannot be represented as in (46) or in a similar way, see Theorem 6.13 in Koldobsky [19], going back to Misiewicz). A direct application of Theorem 1.3 is thus quite unlikely to provide new information regarding approximately Hilbertian subspaces for all finite-dimensional normed spaces.

6. In principle, the measures $T_*(\mu)$ in Theorem 1.3 are not only approximately radial, but are also approximately a composition of isotropic Gaussians. Indeed, it is well-known that any $d$-dimensional marginal of the measure $\sigma_{k-1}$, for $d \ll k$, is approximately an isotropic $d$-dimensional Gaussian measure. Thus, we may project an approximately-radial measure on $\mathbb{R}^k$ to any $d$-dimensional subspace, and obtain a measure which is approximately, in some sense, a composition of isotropic Gaussians. We did not rigorously investigate this approximation property on a precise, quantitative level.

7 Infinite-dimensional spaces 

This section contains a corollary to Theorem 1.3 pertaining to probability measures supported on infinite-dimensional spaces. We begin with a lemma regarding distributions on finite-dimensional spaces. Let $n \ge 1$ be an integer, suppose that $\mu$ is a Borel probability measure on $\mathbb{R}^n$ and let $0 < \alpha < 1$. A subspace $E \subset \mathbb{R}^n$ is "$\alpha$-basic for $\mu$" if

(i) $\mu(E) \ge \alpha$

(ii) $\mu(F) < \alpha$ for any proper subspace $F \subsetneq E$.

Note that any subspace $E \subset \mathbb{R}^n$ with $\mu(E) \ge \alpha$ contains an $\alpha$-basic subspace. Also, suppose $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear map, and let $E \subset \mathbb{R}^n$ be an $\alpha$-basic subspace for $\mu$ containing the kernel of $T$. Then $T(E)$ is $\alpha$-basic for $T_*(\mu)$.

Lemma 7.1 Let $n \ge 1$ be an integer, $0 < \alpha < 1$, and let $\mu$ be a Borel probability measure on $\mathbb{R}^n$. Then there are only finitely many subspaces $E \subset \mathbb{R}^n$ that are $\alpha$-basic for $\mu$.

Proof: Let $k \ge 0$ be an integer and $0 < \alpha < 1$. We will prove by induction on $k$ the following statement: For any integer $n \ge k$ and for any Borel probability measure $\mu$ on $\mathbb{R}^n$, there are at most finitely many subspaces $E \subset \mathbb{R}^n$ whose dimension is at most $k$, that are $\alpha$-basic for the measure $\mu$. The statement clearly implies the lemma. The case $k = 0$ is easy, as there is only one 0-dimensional subspace in $\mathbb{R}^n$.

Let $k \ge 1$. Suppose that $n \ge k$ is an integer, $0 < \alpha < 1$ and let $\mu$ be a Borel probability measure on $\mathbb{R}^n$. Denote by $\mathcal{G}$ the family of all subspaces $E \subset \mathbb{R}^n$ whose dimension is at most $k$ that are $\alpha$-basic for the measure $\mu$. We need to show that

$$\#(\mathcal{G}) < \infty. \tag{47}$$

First, note that it is sufficient to prove (47) under the additional assumption that $\mu(\{0\}) = 0$. Indeed, denote $\varepsilon = \mu(\{0\})$. If $\alpha \le \varepsilon$ then there is only one $\alpha$-basic subspace in $\mathbb{R}^n$, which is the subspace $\{0\}$, and (47) clearly holds. In the non-trivial case where $\alpha > \varepsilon$, we may replace $\mu$ by $(\mu - \varepsilon\delta_0)/(1 - \varepsilon)$ and $\alpha$ by $(\alpha - \varepsilon)/(1 - \varepsilon)$. The family of basic subspaces remains exactly the same. From now on, we will thus assume that $\mu(\{0\}) = 0$.

Denote by $\mathcal{E} \subset \mathcal{G}$ the collection of all subspaces $E \subset \mathbb{R}^n$ that are $\alpha$-basic for $\mu$, with $\dim(E) \le k$, for which $\mu(F) < \alpha^2/8$ for any proper subspace $F \subsetneq E$. We will prove that

$$\#(\mathcal{E}) \le 2/\alpha < \infty. \tag{48}$$

To that end, let £ be any finite subset of £, and denote N — #(f ). For any two distinct 
subspaces Ei, E2 G £, we have fJ,{Ei O E2) < a^/8 as £'1 n £^2 is a proper subspace of 
El. According to the inclusion-exclusion principle, 

where we used the fact that fJ,{E) > a for any E E £, since E is a-basic. We conclude 
that 

Na] 



N(N - l)a2 

1 > iVa i — > Na 

16 



10 



\Na - 5| > 3. (49) 



Thus, there are no finite subsets of $\mathcal{E}$ whose cardinality is $N = \lceil 2/\alpha \rceil$: In this case
$2 \leq N\alpha < 3$, which is impossible according to (49). Hence $\#(\mathcal{E}) < 2/\alpha$ and (48) is
proven.

Next, denote by $\tilde{\mathcal{G}}$ the family of all subspaces $E \subseteq \mathbb{R}^n$ that are $\alpha$-basic, with $\dim(E) \leq k$,
for which there exists a proper subspace $F \subsetneq E$ with $\mu(F) \geq \alpha^2/8$. In view of (48),
in order to deduce (47) it suffices to show that

\[ \#(\tilde{\mathcal{G}}) < \infty. \tag{50} \]

Whenever a subspace $E \subseteq \mathbb{R}^n$ with $\dim(E) \leq k$ contains a proper subspace $F \subsetneq E$ with $\mu(F) \geq \alpha^2/8$,
it also contains an $(\alpha^2/8)$-basic proper subspace $F \subsetneq E$ with $\dim(F) \leq k - 1$. By the
induction hypothesis, there are only finitely many subspaces $F \subseteq \mathbb{R}^n$ that are $(\alpha^2/8)$-basic
for $\mu$ whose dimension is at most $k - 1$. Fix such an $(\alpha^2/8)$-basic subspace $F$. Let $\mathcal{F}$ be the
collection of all subspaces $E \subseteq \mathbb{R}^n$ that are $\alpha$-basic, contain $F$, and satisfy $\dim(E) \leq k$.
The task of proving (50) and completing the proof of the lemma is reduced to showing
that

\[ \#(\mathcal{F}) < \infty. \]

Note that $\dim(F) \geq 1$ as $\mu(\{0\}) = 0 < \alpha^2/8$, and hence $\{0\}$ is not an $(\alpha^2/8)$-basic
subspace. Denote by $P = \mathrm{Proj}_{F^\perp}$ the orthogonal projection operator onto $F^\perp$ in $\mathbb{R}^n$.
Then $\nu = P_*(\mu)$ is a Borel probability measure on $F^\perp$. For any $E \in \mathcal{F}$, the subspace
$P(E)$ is an $\alpha$-basic subspace for the measure $\nu$, and $\dim(P(E)) = \dim(E) - \dim(F) \leq k - 1$.
From the induction hypothesis, we see that the set $\{P(E) ; E \in \mathcal{F}\}$ is finite.
However, $P(E_1) \neq P(E_2)$ for any distinct $E_1, E_2 \in \mathcal{F}$. Thus $\#(\mathcal{F}) < \infty$, as promised.
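
The two facts used in this last step, the dimension count and the injectivity of $E \mapsto P(E)$ on the subspaces containing $F$, both follow from $\ker P = F \subseteq E$. A brief sketch of the reasoning:

```latex
% P is the orthogonal projection onto F^\perp, so \ker P = F.
% For a subspace E containing F, rank-nullity applied to the
% restriction of P to E gives
\[
  \dim P(E) = \dim E - \dim(\ker P \cap E) = \dim E - \dim F .
\]
% Moreover, E is recovered from its image:
\[
  E = P^{-1}\bigl(P(E)\bigr) ,
\]
% because P(x) = P(e) with e \in E implies x - e \in \ker P = F \subseteq E,
% hence x \in E. Consequently P(E_1) = P(E_2) forces E_1 = E_2.
```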

The lemma is proven. □ 

An alternative proof of Lemma 7.1 was suggested by N. Alon. His idea is to replace
the first part of the proof of the induction step with the known fact that there exists a finite
set $A \subseteq \mathbb{R}^n$ that intersects any subspace of measure at least $\alpha$ (see, e.g., Alon and Spencer,
Section 13.4).

We write $\mathbb{R}^\infty$ for the linear space of infinite sequences $a = (a_1, a_2, \ldots)$ with $a_i \in \mathbb{R}$
for all $i \geq 1$. The space $\mathbb{R}^\infty$ is endowed with the standard product topology (also known
as Tychonoff's topology) and the corresponding Borel $\sigma$-algebra. The projection map
$P_n : \mathbb{R}^\infty \to \mathbb{R}^n$ is defined by

\[ P_n(x) = (x_1, \ldots, x_n) \]

for $x = (x_1, x_2, \ldots) \in \mathbb{R}^\infty$. Then $P_n$ is a continuous, linear map. Note that any finite-dimensional
subspace $E \subseteq \mathbb{R}^\infty$ is a closed set. Also for any subspace $E \subseteq \mathbb{R}^\infty$ we
have

\[ \dim(E) = \sup_n \dim(P_n(E)). \tag{51} \]

With a slight abuse of notation, for $m \geq n \geq 1$ we also write $P_n : \mathbb{R}^m \to \mathbb{R}^n$ for
the projection operator defined by $P_n(x_1, \ldots, x_m) = (x_1, \ldots, x_n)$. We will also use
the ridiculous space $\mathbb{R}^0 = \{0\}$, and $P_0(x) = 0$ for any $x$. Let $\varepsilon > 0$ and let $X$ be
a measurable linear space in which all finite-dimensional subspaces are measurable. A
probability measure $\mu$ on $X$ is called $\varepsilon$-decent if for any finite-dimensional subspace
$E \subseteq X$,

\[ \mu(E) \leq \varepsilon \dim(E). \]

Lemma 7.2 Let $\varepsilon > 0$ and let $\mu$ be a Borel probability measure on $\mathbb{R}^\infty$. Suppose that $\mu$
is $\varepsilon$-decent. Then, there exists $N \geq 1$ such that $(P_N)_*\mu$ is $2\varepsilon$-decent.

Proof: For $n \geq 0$ denote $\mu_n = (P_n)_*\mu$, a Borel probability measure on $\mathbb{R}^n$. We say
that a subspace $E \subseteq \mathbb{R}^n$ is "thick" if $\mu_n(E) > 2\varepsilon \dim(E)$. A thick subspace $E \subseteq \mathbb{R}^n$ is
necessarily of dimension at most $(2\varepsilon)^{-1}$. We say that $E$ is a "primitive, thick subspace"
if it is thick and additionally

\[ \mu_n(F) \leq 2\varepsilon \dim(F) \]

for any proper subspace $F \subsetneq E$. Clearly, any thick subspace $E \subseteq \mathbb{R}^n$ contains a
primitive, thick subspace. Observe also that a primitive, thick, $k$-dimensional subspace
$E \subseteq \mathbb{R}^n$ is necessarily $2\varepsilon k$-basic for the measure $\mu_n$. From Lemma 7.1 we thus learn that
for any $n$, there are only finitely many primitive, thick subspaces $E \subseteq \mathbb{R}^n$.
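
The observation that a primitive, thick subspace is basic can be verified in one line. A sketch, assuming the definition of an $\alpha$-basic subspace used in this section ($\mu(E) \geq \alpha$, while $\mu(F) < \alpha$ for every proper subspace $F \subsetneq E$) and taking $k \geq 1$ (for $k = 0$ the only candidate is $\{0\}$):

```latex
% E is primitive and thick in R^n with dim E = k >= 1;
% F is any proper subspace of E, so dim F <= k - 1.
\[
  \mu_n(E) > 2\varepsilon k
  \qquad\text{and}\qquad
  \mu_n(F) \le 2\varepsilon \dim(F) \le 2\varepsilon (k-1) < 2\varepsilon k .
\]
% Thus E is \alpha-basic for \mu_n with \alpha = 2\varepsilon k.
% Thickness also gives 2\varepsilon k < \mu_n(E) <= 1, so 0 < \alpha < 1.
```

Only the finitely many values $k < (2\varepsilon)^{-1}$ can occur, so Lemma 7.1 is invoked finitely many times for each $n$.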

Denote by $V$ the collection of all pairs $(E, n)$ such that $E \subseteq \mathbb{R}^n$ is a primitive, thick
subspace. In order to prove the lemma, it suffices to show that $V$ is finite. Indeed, in
this case, set $N = \max\{n + 1 ; \exists E \subseteq \mathbb{R}^n, (E, n) \in V\}$. Then there are no primitive,
thick subspaces in $\mathbb{R}^N$, and hence there are no thick subspaces in $\mathbb{R}^N$. Consequently
$\mu_N = (P_N)_*(\mu)$ is $(2\varepsilon)$-decent, and the lemma is proven. The rest of the argument is
thus concerned with the proof that $V$ is finite.

Define a directed graph structure on $V$ as follows: There is an edge going from the
node $(E, n) \in V$ to the node $(F, n + 1) \in V$ if and only if $E \subseteq P_n(F)$. Note that for
each node $(F, n + 1) \in V$, the subspace $P_n(F) \subseteq \mathbb{R}^n$ is clearly thick, hence it contains
a primitive, thick subspace $E \subseteq \mathbb{R}^n$. Therefore each node $(F, n + 1)$ is connected to a
certain node $(E, n) \in V$. We conclude that there is a path from $(\{0\}, 0) \in V$ to any node
in $V$. For each $n \geq 1$ there are only finitely many nodes of the form $(E, n) \in V$, since
there are only finitely many primitive, thick subspaces $E \subseteq \mathbb{R}^n$. Therefore, $V$ is finite if
and only if it does not contain an infinite path.
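
The thickness of $P_n(F)$ used in the construction of the graph rests on the identity $P_n = P_n \circ P_{n+1}$. A sketch of the computation:

```latex
% (F, n+1) is a node of V, so F is thick in R^{n+1}.
% If P_{n+1}(x) \in F then P_n(x) = P_n(P_{n+1}(x)) \in P_n(F), hence
% P_{n+1}^{-1}(F) \subseteq P_n^{-1}(P_n(F)) as subsets of R^\infty.
\[
  \mu_n\bigl(P_n(F)\bigr)
  = \mu\bigl(P_n^{-1}(P_n(F))\bigr)
  \ge \mu\bigl(P_{n+1}^{-1}(F)\bigr)
  = \mu_{n+1}(F)
  > 2\varepsilon \dim(F)
  \ge 2\varepsilon \dim\bigl(P_n(F)\bigr) .
\]
```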

We deduce that in order to prove the lemma, it suffices to show that there is no sequence
of subspaces $E_n \subseteq \mathbb{R}^n$ $(n = 0, 1, \ldots)$ such that for any $n \geq 0$,

\[ E_n \subseteq P_n(E_{n+1}) \quad \text{and} \quad (E_n, n) \in V. \tag{52} \]

Assume by contradiction that a sequence of subspaces satisfying (52) exists. Recall that
a subspace of dimension larger than $(2\varepsilon)^{-1}$ cannot be thick, hence $\dim(E_n)$ is bounded
by $(2\varepsilon)^{-1}$. Additionally, $\dim(E_n) \leq \dim(E_{n+1})$ for all $n$. Therefore, there exist $n_0 \geq 1$
and $d \leq (2\varepsilon)^{-1}$ such that

\[ \dim(E_n) = d \quad \text{for all } n \geq n_0. \]

Consequently, $E_n = P_n(E_{n+1})$ for any $n \geq n_0$. Consider the direct limit

\[ E = \{ a \in \mathbb{R}^\infty ; P_n(a) \in E_n \text{ for all } n \geq n_0 \} \subseteq \mathbb{R}^\infty. \]

Then $E = \bigcap_{n \geq n_0} P_n^{-1}(E_n)$ is a subspace of $\mathbb{R}^\infty$ with $P_n(E) = E_n$ for all $n \geq n_0$.
Furthermore, $\dim(E) = d$ according to (51). Note that $P_n^{-1}(E_n) \supseteq P_{n+1}^{-1}(E_{n+1})$ for
any $n \geq n_0$. Therefore

\[ \mu(E) = \mu\Big( \bigcap_{n \geq n_0} P_n^{-1}(E_n) \Big) = \lim_{n \to \infty} \mu\big( P_n^{-1}(E_n) \big) = \lim_{n \to \infty} \mu_n(E_n) \geq 2\varepsilon d, \]

since $E_n \subseteq \mathbb{R}^n$ is a $d$-dimensional thick subspace. Hence $\mu(E) \geq 2\varepsilon d$, in contradiction
to our assumption that $\mu$ is $\varepsilon$-decent. We conclude that there are no infinite paths in $V$,
and hence that $V$ is finite. The lemma is proven. □

Suppose $X$ is a topological vector space. We say that $X$ has a countable separating
family of continuous, linear functionals if there exist continuous linear functionals
$f_1, f_2, \ldots : X \to \mathbb{R}$ such that for any $x \in X$,

\[ x = 0 \iff \forall n, \ f_n(x) = 0. \]

This condition is not too restrictive. For example, any separable normed space, any separable
Fréchet space, and any topological vector space dual to a separable Fréchet space
admits a countable, separating family of continuous, linear functionals.

Corollary 7.3 Let $\varepsilon > 0$, let $d \geq 1$ be an integer, and let $X$ be a topological vector space
with a countable separating family of continuous, linear functionals. Suppose that $\mu$ is
an $\varepsilon$-decent Borel probability measure on $X$. Then, there exists a continuous linear map
$T : X \to \mathbb{R}^d$ such that $T_*(\mu)$ is $\delta$-radial proper, for $\delta = c\,\varepsilon^{\,\cdots}$. Here, $c > 0$ is a universal
constant.

Proof: Let $f_1, f_2, \ldots : X \to \mathbb{R}$ be the separating sequence of continuous linear
functionals. Then the linear map $T : X \to \mathbb{R}^\infty$ defined by

\[ T(x) = (f_1(x), f_2(x), \ldots) \]

is a continuous linear embedding. Since $\mu$ is $\varepsilon$-decent, $T_*(\mu)$ is also an $\varepsilon$-decent
Borel probability measure on $\mathbb{R}^\infty$. According to Lemma 7.2 there exist a finite $N \geq 1$
and a continuous linear map $P : \mathbb{R}^\infty \to \mathbb{R}^N$ such that $(P \circ T)_*(\mu)$ is a $2\varepsilon$-decent measure
on $\mathbb{R}^N$. The corollary now follows from Theorem 1.3. □

Note that the linear map $T$ in Corollary 7.3 is not only measurable but also continuous.
In principle, we could have formulated Corollary 7.3 for a probability measure on a
measurable linear space, without having to rely on an ambient topology: All we need is a
linear, measurable embedding in $\mathbb{R}^\infty$. We refer the reader to Tsirelson [30] for a discussion
of measures on infinite-dimensional linear spaces, and for an exposition of Vershik's
"de-topologization" program [31, 32]. We conclude this note with an infinite-dimensional
analog of Corollary 1.4.

Corollary 7.4 Let $X$ be a topological vector space with a countable separating family of
continuous, linear functionals. Suppose that $\mu$ is a Borel probability measure on $X$ such
that $\mu(E) = 0$ for any finite-dimensional subspace $E \subseteq X$. Then, for any $R > 0$, there
exists a non-zero, continuous linear functional $\varphi : X \to \mathbb{R}$ such that

\[ \mu(\{x ; \varphi(x) \geq tM\}) \geq c \exp(-Ct^2) \quad \text{for all } 0 \leq t \leq R \]

and

\[ \mu(\{x ; \varphi(x) \leq -tM\}) \geq c \exp(-Ct^2) \quad \text{for all } 0 \leq t \leq R, \]

where $M > 0$ is a median, that is,

\[ \mu(\{x ; |\varphi(x)| \leq M\}) \geq 1/2 \quad \text{and} \quad \mu(\{x ; |\varphi(x)| \geq M\}) \geq 1/2, \]

and $c, C > 0$ are universal constants.